This report describes the design and development of a real-time adaptive
transform coder that transmits high-quality speech over a 9600 bps channel
with bit-error rates of up to 1% without significant loss of speech
fidelity. The report presents the results of our FORTRAN simulations of
the adaptive transform coder (ATC), which maximized the quality of the
transmitted speech. Important aspects of the ATC algorithm that were optimized
were the specification and transmission of the side-band information, the accuracy
of the pitch and voicing decisions, and the error protection of the important
transmission parameters. Also included are the system design, detailed
documentation, and program listings of the MAP-300 real-time implementation
of the optimized ATC speech coder. Finally, the report includes a
description of analog equipment GTE built to interface the MAP-300 to
telephone handsets and tape recorders and a description of digital circuits
(RS 423 compatible) to interface the MAP-300 to a modem.

This report is bound in two volumes. Volume I contains a description of
the ATC system and the results of the FORTRAN simulations. Volume II contains
all the information on the real-time system, including documentation
for implementing the ATC system on the MAP, listings of the MAP software,
and documentation for the hardware built by GTE.
Chapter 1


Summary  of  Program 


1.1  Introduction 

Under the 9600 BPS Speech Optimization Study, GTE Sylvania simulated
and implemented a full duplex Adaptive Transform Coder (ATC) speech digitization
algorithm. The simulations were performed using FORTRAN computer
programs, while the implementation used a CSP, Inc. MAP-300 floating point
array processor with digital and audio input/output circuitry designed by GTE.

This study and implementation effort has resulted in a number of significant
accomplishments in developing speech digitization algorithms. The
most important of these include:

a.  The  demonstration  via  FORTRAN  simulation  that  ATC  at  9600  bps  can 
produce  good  quality  speech  having  a  Signal-to-Noise  ratio  (S/N) 
of  about  17  dB. 

b.  Establishment  of  a  benchmark  speech  processing  technique  at  9600 
b/s  which  indicates  that  high  quality  speech  is  possible  at  this 
data  rate. 

c. The development of error coding techniques which will permit ATC
to function at a bit error rate (BER) of $10^{-2}$ with little reduction
in S/N using (63,45) BCH codes.

d.  The  design  and  implementation  of  analog  audio  circuitry  to  permit 
speech  to  be  input  to  and  output  from  the  CSP,  Inc.  MAP  processor 
from  microphones  and  tape  recorders. 

e.  The  design  and  implementation  of  a  digital  transmission  interface 
(RS  423  compatible)  to  the  CSP,  Inc.  MAP  processor  so  that  data 
can  be  sent  to  a  modem. 
f. Implementation of a real-time full duplex ATC speech digitizer
on the CSP, Inc. MAP processor whose block diagram is shown in
Figure 1.1-1. This digitizer performs its processing with floating
point arithmetic and does not compromise numerical accuracy.

g. Real-time demonstration of ATC in the presence of a $10^{-2}$ channel
error rate without significant performance degradation.

The  voice  quality  produced  by  the  ATC  simulation  is  the  best  of  any 
technique  operating  at  9600  b/s  now  known  to  GTE.  The  technique,  whose 
specifications  are  shown  in  Table  1.2-1,  is  numerically  complex  requiring 
the  complete  processing  capability  of  the  CSP,  Inc.  MAP-300  floating  point 
processor.  Thus,  for  ATC  to  be  practical,  either  higher  speed  hardware 
must  be  built  or  the  technique  must  be  simplified. 

The investigation and developments leading to the real-time ATC system
proceeded in three phases. During the first phase the ATC algorithm originally
proposed by Zelinski and Noll [1,2] was investigated and the modifications
proposed by Crochiere and Tribolet [3,4] were incorporated to improve voice
quality. Numerous FORTRAN simulations were conducted to optimize performance
with respect to data rate, channel error performance, and robustness to speaker
and room noise. At the end of the first phase, which lasted about 4 months,
the ATC algorithm was frozen and the real-time implementation begun.

Concurrent with the first phase was the design and fabrication of the
digital and analog I/O interfaces to the CSP, Inc. MAP-300 processor. In
addition to building our own units, GTE Sylvania, under separate subcontracts
with BBN and Notre Dame University, built two additional units for incorporation
with their MAP-300 speech processing systems.
The third phase, the real-time implementation, began in February 1979
and continued until August 1980. During this time, test programs for the
analog and digital I/O were developed and numerous software and hardware
problems with the MAP-300 were resolved. Finally, in the summer of 1979,
the first working modules of the ATC digitizer were operational on the MAP-300,
and it was at this time that the scope of the software development
project became apparent. The MAP-300, for all its speed, was barely adequate
to perform ATC with error control in a full duplex mode. Consequently,
from August 1979 to the delivery of the ATC system a year later, considerable
effort was placed on writing efficient MAP-300 real-time software.

The final ATC system, as delivered to DCA, indicates that a full
duplex ATC speech processing system can operate on the MAP-300 processor in
real-time.

Future speech digitization development at 9600 b/s cannot ignore the ATC
algorithm because, even though the technique is complex, it shows that good
quality speech is possible at this data rate. Thus, the ATC technique
developed under this contract will serve as a benchmark or standard against
which to compare all new 9600 b/s speech digitization algorithms.

This report is written in two volumes. Volume 1 contains documentation
on the ATC simulations, while Volume 2 contains documentation on the
real-time software and hardware I/O circuitry.
Chapter 2


Simulation  of  the  ATC  Algorithm 


2.1  Introduction 

Adaptive Transform Coding (ATC) was originally proposed by Zelinski
and Noll [1,2] and represents an efficient block-coding technique for speech
digitization in the 8.0 to 16 kb/s range. Early simulations of the ATC
algorithm at GTE Sylvania indicated that this technique was capable of
producing better speech quality than any other technique at 9.6 kbps known
to the company at the time. When DCA requested the study of new techniques
at 9.6 kbps, GTE Sylvania responded with the ATC algorithm as originally
proposed by Zelinski and Noll. Later articles by Tribolet and Crochiere [3,4],
however, indicated that further improvements were possible in the algorithm,
and after contract award, GTE decided, based on simulations conducted
under its IR&D program, to develop this algorithm even though it was about
50% more complex than the original Zelinski and Noll design.

In this chapter we first discuss the theory of ATC operation. Then we
discuss the simulation and optimization of this system and the need for
error protection and correction for some critical transmission parameters.
Finally, we discuss the results of the simulation with and without error
protective coding in the presence of channel error rates as high as one error
in 100 bits (a BER of $10^{-2}$).

2.2  Basic  Principles  of  ATC  Operation 

In its basic form, ATC consists of sending the largest cosine transform
coefficients of a segment of data, with each coefficient quantized according
to an algorithm that gives the larger coefficients more bits than the smaller
coefficients. This ATC algorithm departs from earlier algorithms that not
only had to send the amplitudes of the coefficients, but also had to send
considerable information about which coefficients were quantized and how
many bits were associated with each. This extra information could consume
as much data capacity as the coefficient amplitudes themselves. Attempts at
sending only specific coefficients or the use of a fixed-bit assignment generally
reduced voice quality by creating waveform discontinuities at the
frame boundaries and by spectrally distorting the signal between boundaries.

In ATC, however, information about which amplitude is sent and how many
bits are allocated to each is contained in the basis spectrum, which requires
from 1200 to 2400 b/s. This basis spectrum generally is information about
the envelope of the transform coefficients being quantized. Its calculation
can be performed by the smoothing of transform coefficients or by separate
estimates involving least-square analysis.


To understand ATC, consider a sampled waveform segment shown in Figure
2.2-1(a). If this waveform is multiplied by 1/2, delayed by half the sampling
interval T, and reflected about t=0, it yields $x_2(t)$, whose Fourier transform
is given by:

$$X_2(f) = \sum_{n=-(N-1)}^{N-1} x_2(nT) \exp\left(-j 2\pi f (n + 1/2) T\right) \qquad (2.2-1)$$

If we sample the Fourier transform $X_2(f)$ at frequencies $m/(2NT)$, the
discrete Fourier transform (DFT) becomes

$$X_2\left(\frac{m}{2NT}\right) = X_2(m) = \sum_{n=-(N-1)}^{N-1} x_2(nT) \exp\left(-j \frac{\pi m}{N} (n + 1/2)\right) \qquad (2.2-2)$$
Figure 2.2-1: Discrete Cosine Transform Operation. Panels: (b) Reflected Waveform, $x_2(t)$; (c) DFT Output, cosine spectrum $X_2(k)$.
Using symmetry properties of $x_2(nT)$, $X_2(m)$ shown in Figure 2.2-1(b) is
real only and is given by

$$X_2(m) = \sum_{n=0}^{N-1} x_1(nT) \cos\left(\frac{\pi m}{2N} (2n + 1)\right), \qquad 0 \le m \le N-1 \qquad (2.2-3)$$


Equation  (2.2-3)  is  the  cosine  transform.  This  derivation  shows  that 
the  Fast  Fourier  Transform  (FFT)  can  be  used  to  implement  the  cosine  transform 
by  delaying  and  reflecting  the  original  waveform  and  then  taking  the  FFT  on 
a  waveform  twice  as  long  as  the  original. 
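The equivalence stated above can be checked numerically. The following is a minimal sketch, assuming the reflected waveform is formed by appending the mirror image of the frame and that the half-sample delay is applied as a phase rotation after the transform. The function names are illustrative, and a naive DFT stands in for the FFT so that the example needs only the Python standard library.

```python
import cmath
import math


def dct_direct(x):
    """Cosine transform of Equation (2.2-3), evaluated term by term."""
    n_pts = len(x)
    return [
        sum(x[n] * math.cos(math.pi * m * (2 * n + 1) / (2 * n_pts))
            for n in range(n_pts))
        for m in range(n_pts)
    ]


def dft(y):
    """Naive O(L^2) DFT; a stand-in for an FFT routine."""
    size = len(y)
    return [
        sum(y[n] * cmath.exp(-2j * math.pi * m * n / size) for n in range(size))
        for m in range(size)
    ]


def dct_via_dft(x):
    """Cosine transform obtained from a DFT of the reflected 2N-point waveform."""
    n_pts = len(x)
    reflected = list(x) + list(reversed(x))   # frame followed by its mirror image
    spectrum = dft(reflected)                 # transform twice as long as the frame
    out = []
    for m in range(n_pts):
        # Half-sample delay expressed as a phase rotation; the factor 1/2
        # corresponds to the "multiplied by 1/2" step in the text.
        rotated = cmath.exp(-1j * math.pi * m / (2 * n_pts)) * spectrum[m]
        out.append(0.5 * rotated.real)
    return out


frame = [0.3, -1.2, 0.8, 2.0, -0.5, 0.1, 1.4, -0.9]
direct = dct_direct(frame)
fast = dct_via_dft(frame)
max_diff = max(abs(a - b) for a, b in zip(direct, fast))
```

In this sketch `max_diff` stays at rounding-error level, which is the sense in which a transform on the double-length waveform reproduces Equation (2.2-3).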


The most expensive implementation costs with the ATC algorithm are associated
with the Discrete Cosine Transform (DCT) and Discrete Fourier Transform
(DFT). Although the DCT cannot be employed directly, methods elaborated by
Ahmed et al. [6] and Cooley et al. [7] use the DFT to compute the desired transform.

Our  FORTRAN  simulations  used  the  Cooley  method  for  DCT  calculation  and  a 
special  FFT  algorithm  to  lower  simulation  costs. 


After calculation of the DCT coefficients, the basis spectrum (envelope
of the cosine transform) can be estimated by making all the cosine transform
coefficients positive and smoothing between peaks. To send the envelope
efficiently, we can quantize the amplitudes of every mth (m is typically 8)
envelope sample and send those as the coefficients of the basis spectrum.

However, this original ATC algorithm, as proposed by Zelinski and Noll,
suffers from a "burbling" characteristic at lower data rates. To reduce
this distortion, Tribolet uses side transmission of pitch and spectral parameters
obtained by Linear Predictive Coding (LPC) [5] analysis. The side transmission
of the LPC and pitch parameters does in fact remove the "burbling"
sound and improve the overall signal-to-noise ratio. Figure 2.2-2 describes
the operation of this ATC digitizer.

Figure 2.2-2: Adaptive Transform Coder

The innovative solution to the basis spectrum calculation is formed from
a least-square analysis of $x_2(t)$, that is, finding those predictor coefficients
which minimize

(2.2-4)

These predictor coefficients, or alternately reflection coefficients, carry
information about the envelope since:

$$\Gamma(f) = \mathrm{FFT}(a_j) \qquad (2.2-5)$$

and the envelope is then $\Gamma^{-1}(f)$.

In addition to linear predictive modeling of the ATC spectrum, the Tribolet
approach uses a pitch excitation source. This accounts for the fine
structure in the short-time spectrum, which is consistent with the known
mechanisms of speech production. This scheme forces the assignment of transform
bits to many pitch striations that otherwise would not be transmitted
at all.


With reference to Figure 2.2-3, the ATC analysis is described as follows:

1. The input speech (Figure 2.2-3(a)) is Fourier transformed to yield a
DCT spectrum (Figure 2.2-3(b)). This spectrum is squared, windowed,
and inverse Fourier transformed to yield an autocorrelation function
(i.e., pseudo-ACF) of the reflected speech waveform. The first P+1
values of this function are used to define a correlation matrix in
the usual normal equation formulation sense. The solution of these
equations (i.e., Levinson recursion) yields a prediction filter of
order P. The inverse spectrum of this filter yields a smoothed
estimate of the DCT spectrum levels (Figure 2.2-3(c)) to be used
in the adaptation of the quantizers.

2. A rudimentary estimate of the pitch value, M, is found in the
pseudo-ACF after the second zero crossing beyond the P+1 ACF
value. A corresponding gain factor, G, is also computed as the
ratio ACF(M)/ACF(0). With these two parameters, a pitch pattern
is generated in the frequency domain (Figure 2.2-3(d)) and
applied congruently with the LPC spectrum. This combination,
yielding a linear prediction spectral fit to the DCT of the input
speech, is called the basis spectrum (Figure 2.2-3(e)).

3. The computation to determine the number of bits to allocate for
each transform coefficient then proceeds as follows:

Let $\sigma_i$ be the amplitude of the ith term of the envelope of the basis
spectrum. Then $B_i$, the number of bits allocated to the ith cosine transform
coefficient, is given by:

$$B_i = \left[ \frac{B_f}{N} - \frac{1}{2N} \sum_{j=1}^{N} \log_2 \sigma_j^2 \right] + \frac{1}{2} \log_2 \sigma_i^2 \qquad (2.2-6)$$

where

$B_f$ = the total number of bits allocated to send the cosine transform
coefficients per frame

N = the total number of cosine transform coefficients calculated per
frame.

Note that the term in brackets is calculated once per frame. Fairly simple
algorithms ensure that $B_i$ is an integer value and that the sum of the integers
adds to $B_f$.

The cosine transform coefficients approximate a Gaussian probability
density function. Optimum Gaussian quantizers derived by Max [20] can be used to
encode each transform coefficient with $B_i$ bits. Since many of the $B_i$'s will
be zero, only larger coefficients are sent. However, GTE Sylvania's experience
with the 9600-b/s ATC has shown that optimal quantizers can be developed that
more closely match the transform distribution.
4. The receiver uses the basis spectrum information (LPC, M, G) to
regenerate the DCT envelope, to generate the bit allocation using
Equation (2.2-6), to decode the cosine transform coefficients (Figure
2.2-3(f)), and then to take the inverse cosine transform using
the FFT. Frame boundary problems exist at all data rates since
quantization of the transform coefficients causes the regenerated
waveform to be slightly different from the original. By overlapping
the frames slightly and by interpolating across the frame boundaries,
these discontinuities can be smoothed.


The overall quality of this approach can be estimated from Figure 2.2-3(g),
which shows the error waveform defined as:

$$e(n) = s(n) - \hat{s}(n) \qquad (2.2-7)$$

The received waveform, $\hat{s}(n)$, has a high signal-to-noise ratio (approximately 17 dB) for
some speakers, even for erroneous pitch estimations made in the analyzer.
In fact, GTE Sylvania has demonstrated through audio tapes that an eighth-order
LPC predictor (P = 8), coupled with the rudimentary pitch extractor
(and no voiced/unvoiced logic), yields consistently high-quality speech.
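As an illustration of how a signal-to-noise figure follows from the error waveform of Equation (2.2-7), the sketch below computes a whole-signal SNR in decibels. The report does not state whether its figures are whole-signal or frame-averaged, so the single-ratio form used here is an assumption, and the test signal is synthetic rather than speech.

```python
import math


def snr_db(original, received):
    """SNR in dB between s(n) and s_hat(n), using e(n) = s(n) - s_hat(n)."""
    signal_power = sum(s * s for s in original)
    noise_power = sum((s - r) ** 2 for s, r in zip(original, received))
    if noise_power == 0.0:
        return math.inf          # identical waveforms: no error energy
    return 10.0 * math.log10(signal_power / noise_power)


# Synthetic example: a sinusoid and a coarsely rounded copy of it.
s = [math.sin(2 * math.pi * 5 * n / 64) for n in range(64)]
s_hat = [round(v * 8) / 8 for v in s]
example_snr = snr_db(s, s_hat)
```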
2.3 Optimization and Modification of the ATC System

The adaptive transform coding scheme shown in Figure 2.2-2 produces high
quality synthesized speech at data rates above 9600 bps. In this scheme, the quality,
both in objective signal-to-noise ratio and in subjective perceptual effects,
degrades as the transmission data rate is lowered.

There are several sources which reduce the voice quality, but solutions
for most of these problems exist. The first one is the quantization
noise caused by coarse quantization of the DCT coefficients at low data
rates. This problem can be minimized by developing optimal quantizers from
the distribution of actual DCT coefficients. The second and most severe
degradation source is the reduction in bandwidth at low data rates. This
effective lowpass filtering stems from the fact that only large DCT
coefficients are being coded because there are not sufficient bits to
send the smaller coefficients. The effects of lowpass filtering can be
removed by the addition of random noise to the low energy frequency
band. The third source of degradation is waveform discontinuities at
the frame boundaries, since the DCT coefficients are coarsely quantized
and some low valued coefficients are not transmitted. These effects
may be reduced by overlapping the frames slightly and by interpolating
across the frame boundaries.

There are other areas for improvement in the ATC scheme. The first
one is the trade-off of bits allocated to DCT coefficients and to side
information within a given transmission data rate. Another is the method
for quantizing the side information. It can be shown that closer estimation
(in the mean square error sense) of the basis function to the actual
DCT coefficients provides better performance of the ATC coder. In fact, in
the extreme case, where the basis spectrum equals the actual spectrum,
quantization of the DCT can be precise, eliminating lowpass effects and
boundary problems as long as the sign bits of the DCT coefficients are provided,
because the DCT is a unitary transform.

In  the  following  sections,  the  modified  ATC  system  will  be  described. 
Also,  those  areas  requiring  further  developments  will  be  described  and 
the  possible  improvements  will  be  discussed. 

2.3.1  Description  of  the  Modified  ATC  System 

The block diagrams of the ATC analyzer and synthesizer are shown in
Figure 2.3-1 and Figure 2.3-2, respectively. In this scheme, the input speech
is buffered into blocks of data {v(n)}, each of which constitutes a frame.
Successive frames are overlapped slightly (about 10 samples) in order
to reduce the frame boundary problems. The mean and variance of the input
speech signal are calculated so that the frame can be transformed to zero mean
and unit variance. This mean and variance are quantized and sent to the receiver
for the renormalization of the synthesized speech. The Discrete Cosine
Transform is calculated on the zero mean and unit variance input. The DCT
coefficients are then adaptively quantized to form the mainband information
and transmitted to the receiver. At the receiver, they are decoded and
inverse transformed to reproduce the zero mean and unit variance speech
signal. This signal is renormalized by the mean and variance to produce
the synthesized speech.

In  the  meantime,  the  mean  and  variance  of  the  signal  are  decoded 
and  dequantized  to  renormalize  the  inverse  transformed  signal.  In  order 
to  reduce  the  effects  of  signal  discontinuities  at  the  frame  boundaries, 
the  overlapped  signals  are  interpolated. 
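The normalization, renormalization, and boundary interpolation steps described above can be sketched as follows. The linear cross-fade over the overlapped samples is an assumption made for illustration, since the text says only that the overlapped signals are interpolated. The function names are hypothetical, and quantization of the mean and variance is omitted.

```python
import math


def normalize(frame):
    """Return (zero-mean unit-variance frame, mean, variance)."""
    n = len(frame)
    mean = sum(frame) / n
    var = sum((v - mean) ** 2 for v in frame) / n
    scale = math.sqrt(var) if var > 0.0 else 1.0   # guard against silent frames
    return [(v - mean) / scale for v in frame], mean, var


def renormalize(unit_frame, mean, var):
    """Undo normalize() at the receiver using the transmitted mean and variance."""
    scale = math.sqrt(var) if var > 0.0 else 1.0
    return [v * scale + mean for v in unit_frame]


def join_frames(prev_frame, next_frame, overlap=10):
    """Join two frames that share `overlap` samples (overlap >= 1),
    cross-fading linearly across the shared region (assumed interpolation)."""
    head = prev_frame[:-overlap]
    tail = next_frame[overlap:]
    start = len(prev_frame) - overlap
    blended = []
    for i in range(overlap):
        w = (i + 1) / (overlap + 1)               # weight moves toward next_frame
        blended.append((1.0 - w) * prev_frame[start + i] + w * next_frame[i])
    return head + blended + tail


unit, frame_mean, frame_var = normalize([1.0, 2.0, 3.0, 4.0])
restored = renormalize(unit, frame_mean, frame_var)
joined = join_frames([1.0] * 20, [1.0] * 20, overlap=10)
```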
FIGURE 2.3-1 BLOCK DIAGRAM OF THE ATC ANALYZER


FIGURE 2.3-2 BLOCK DIAGRAM OF THE ATC SYNTHESIZER


The adaptive quantization and dequantization in this scheme are
based on the sideband information, from which the basis spectrum is computed.
The bit assignments and step sizes are determined from the basis spectrum
by the optimum bit assignment rule. The
sideband information includes the pitch gain (PG), pitch number (M),
voiced/unvoiced decision, and the 8 PARCOR coefficients. The mean and
variance of the input speech signal are also included in the sideband
information. The sideband and mainband data are encoded in
three blocks of a (63,45) BCH code in order to reduce the effects of
channel errors. Channel errors occurring during transmission
through the noisy channel are corrected by the decoder. The mainband
information is fed to the synthesizer to reproduce the speech
signal.
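To give a feel for what a (63,45) BCH block provides at the channel error rates considered in this report, the sketch below estimates how often a block receives more errors than the code can correct. It assumes independent bit errors and a decoder that corrects up to t = 3 errors per block, which is the designed capability of the binary (63,45) BCH code. The function name is illustrative and no figure from the report is reproduced.

```python
from math import comb


def block_failure_probability(n, t, ber):
    """P(more than t bit errors in an n-bit block), independent errors assumed."""
    p_ok = sum(comb(n, k) * ber ** k * (1.0 - ber) ** (n - k)
               for k in range(t + 1))
    return 1.0 - p_ok


N_BITS, K_BITS, T_CORRECT = 63, 45, 3      # (63,45) BCH, 3-error correcting
code_rate = K_BITS / N_BITS                # fraction of channel bits carrying data
p_fail = block_failure_probability(N_BITS, T_CORRECT, 1e-2)
```

Under these assumptions the block failure probability at a BER of $10^{-2}$ comes out well below the raw bit error rate, at the cost of 18 parity bits per 45 information bits.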

2.3.2  Basis  Spectrum  of  the  ATC  System 

The performance of the ATC system is heavily dependent on the generation
of the basis spectrum from which the adaptive quantization and
dequantization rule is derived. Two basic adaptation techniques have been
proposed. The first technique, proposed by Zelinski and Noll [1,2], is
described in Figure 2.3-3.

After the calculation of the DCT coefficients, the basis spectrum is
estimated by making the DCT coefficients positive and averaging between peaks
to compress the DCT envelope. The amplitudes of every mth (m is typically
8 to 16) sample of the envelope are quantized and sent to the receiver
to represent the spectral levels at specified frequencies. These amplitudes
are then geometrically interpolated (i.e., linearly interpolated
in log amplitude) to form the basis spectrum. This simple, "non-speech
specific" algorithm is quite appropriate for speech transmission above
9.6 kbps. However, the synthesized signal is degraded by a very perceptible
"burbling" distortion as the data rate decreases below 9.6 kbps.
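The geometric interpolation step can be sketched as below. The function name and the handling of the final sample are illustrative assumptions. The envelope samples are assumed to be strictly positive so that the logarithm is defined.

```python
import math


def geometric_interpolate(samples, spacing):
    """Rebuild an envelope from samples taken every `spacing` bins by
    interpolating linearly in log amplitude between neighbouring samples."""
    envelope = []
    for left, right in zip(samples, samples[1:]):
        log_l, log_r = math.log(left), math.log(right)
        for i in range(spacing):
            frac = i / spacing
            envelope.append(math.exp((1.0 - frac) * log_l + frac * log_r))
    envelope.append(samples[-1])              # keep the last transmitted level
    return envelope


# Two transmitted levels, one interpolated bin between them.
basis = geometric_interpolate([1.0, 4.0], 2)
```

With levels 1 and 4, the midpoint of this interpolation is the geometric mean 2 rather than the arithmetic mean 2.5, which is what "linear in log amplitude" implies.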

Zelinski and Noll [2] suggested incorporating a form of voice-excited
"fill-in" procedure similar to that used in the voice-excited vocoder technique.
In their technique, low energy frequency bands, which receive no
bits for encoding at the transmitter, are filled in at the receiver with
random noise in order to enhance the perceived speech quality. Some
improvements have been reported, but the addition of random noise
introduces some hoarseness to the synthesized speech. They adjust the
amount of added random noise to optimize the speech quality; i.e., the
problems of "burbling" and "hoarseness" are reduced, but this is not sufficient
to overcome the aforementioned difficulties at data rates below
9.6 kbps. Tribolet and Crochiere [4] proposed a more appropriate algorithm
for bit rates below 9.6 kb/s, a "speech specific" adaptation
algorithm that takes full advantage of the known models and dynamics of
the speech production mechanism in order to predict the DCT spectral
levels. This algorithm is based on an all-pole model of the formant structure
of speech and a pitch model to represent the fine structure (pitch
striations) in the speech spectrum. The resulting algorithm is
referred to as a "vocoder-driven" adaptation strategy due to the close
relationship of this spectral estimate to a vocoder model.

The block diagram in Figure 2.3-1 illustrates the implementation
of the technique. First the DCT spectrum is squared and inverse transformed
with an inverse DFT. This yields an autocorrelation-like function,
the pseudo-ACF (autocorrelation function). The first P + 1 values of
this function are used to define a correlation matrix in the usual normal
equations formulation sense [13]. The solution of these equations yields
an LPC filter of order P. The inverse spectrum, illustrated in Figure
2.2-3(c), yields an estimate of the formant structure of the DCT spectrum, denoted
as $\sigma_t(k)$.
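The chain from DCT spectrum to formant envelope can be sketched as follows. Several details are assumptions made for illustration only: the inverse DFT of the squared spectrum is written as a cosine sum over an even extension of the spectrum, the windowing step is omitted, the predictor is evaluated at frequencies $\pi k / N$ for DCT bin k, and the function names are hypothetical. The Levinson recursion itself is the standard one for the normal equations.

```python
import cmath
import math


def pseudo_acf(dct_coeffs, n_lags):
    """Inverse transform of the squared DCT spectrum (even extension assumed)."""
    n = len(dct_coeffs)
    power = [c * c for c in dct_coeffs]
    return [
        power[0] + 2.0 * sum(power[k] * math.cos(math.pi * k * lag / n)
                             for k in range(1, n))
        for lag in range(n_lags)
    ]


def levinson(r, order):
    """Solve the normal equations; returns (a, err) with A(z) = 1 + sum a_j z^-j."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                       # reflection (PARCOR) coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= (1.0 - k * k)                 # remaining prediction error
    return a, err


def lpc_envelope(a, err, n_bins):
    """Inverse spectrum of the prediction filter, one value per DCT bin."""
    env = []
    for k in range(n_bins):
        w = math.pi * k / n_bins
        a_of_w = sum(a[j] * cmath.exp(-1j * w * j) for j in range(len(a)))
        env.append(math.sqrt(err) / abs(a_of_w))
    return env


spectrum = [1.0 / (1 + k) for k in range(32)]    # synthetic, smoothly decaying
acf = pseudo_acf(spectrum, 5)                    # first P + 1 values, P = 4
coeffs, residual = levinson(acf, 4)
sigma_t = lpc_envelope(coeffs, residual, 32)
```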

The  fine  structure  of  the  DCT  spectrum  is  obtained  from  a  pitch  model. 
To  obtain  the  pitch  period,  M,  the  pseudo-ACF  is  searched  for  a  maximum. 

The pitch estimate taken from the rudimentary procedure suggested by
Tribolet and Crochiere [3,4] has a definite bearing on the SNR of the processed
speech. The use of this imperfect pitch value does not grossly affect
the subjective voice quality. However, in order to derive the most impact
from the use of a pitch weighting function, the original pitch extraction
procedure has been modified. The flowchart of the search routine
for the pitch period, M, is shown in Figure 2.3-4. It consists of a simple
search routine which commences after the appearance of the second zero
crossing in the autocorrelation function. The pitch contour which results
from this technique is more accurate than the original unconstrained
approach, with a corresponding increase in the cumulative SNR. This
simple scheme has proven to be adequate for the development of the ATC
system when the voiced/unvoiced decision device is incorporated. The
corresponding pitch gain, G, is the ratio of the pseudo-ACF at M over
its value at the origin. With these two parameters, a pitch pattern
$\sigma_p(k)$ is generated in the frequency domain as illustrated in Figure 2.2-3(d).
The two spectral components $\sigma_t(k)$ and $\sigma_p(k)$ are multiplied and normalized
to yield the final spectral estimate $\sigma_s(k)$,

$$\sigma_s(k) = \sigma_t(k)\,\sigma_p(k), \qquad k = 0, 1, 2, \ldots, N-1 \qquad (2.3-1)$$


Figure 2.3-4: Flowchart of the pitch period search routine. Given: ACF(I), I = 1, ..., N; MP: previous pitch.

This estimate, illustrated by Figure 2.2-3(e), is then used for the bit assignment
and step-size adaptation algorithms as seen in Figure 2.3-1.

There are many ways of generating a pitch weighting function in the
frequency domain. In the GTE Sylvania model, the pitch gain is first
defined as

$$G = \mathrm{ACF}(M) / \mathrm{ACF}(0) \qquad (2.3-2)$$

and a time domain pitch impulse train with exponentially decaying amplitudes
is generated as

$$p(n) = \begin{cases} G^k, & n = kM,\ k = 0, 1, \ldots, K,\ K = \lfloor N/M \rfloor \\ 0, & \text{otherwise} \end{cases} \qquad (2.3-3)$$

where N is the number of speech samples in a frame and $\lfloor \cdot \rfloor$ denotes the
largest integer not exceeding its argument. This time domain signal p(n) is transformed into a
zero mean and unit power process $p_1(n)$, which is again transformed into the
frequency domain as

$$P_1(k) = \sum_{n=0}^{N-1} p_1(n)\, e^{-j 2\pi n k / N}, \qquad k = 0, 1, \ldots, N-1 \qquad (2.3-4)$$

This periodic pitch weighting function $P_1(k)$, shown in Figure 2.3-5, when
multiplied by the LPC spectrum, is adequate for the generation of
the basis function in many cases. However, there are cases in which the
pitch harmonics are not well preserved in the high frequency band for
some voiced sounds, particularly for the voiced fricatives (V, Z).
There are also many cases where the pitch harmonics of $P_1(k)$ are not
matched to the actual spectrum of the input speech, particularly in the high
frequency band. This fact can be explained by errors in the pitch period in the
time domain. Most gross errors of the pitch period are caused by erroneous
decisions of the pitch detection routine. However, small errors in the
pitch period estimates will always exist because of the discrete sampling
process in the time domain.

FIGURE 2.3-5 GENERATION OF THE PITCH WEIGHTING FUNCTION

Therefore, we generated a pitch weighting function which is periodic in
the low frequency band and close to unit amplitude in the high frequency band.
One such function can be generated as

$$P_2(k) = P_1(k)\, W_1(k) + W_2(k) \qquad (2.3-5)$$

where the weighting functions $W_1(k)$ and $W_2(k)$ are shown in Figure 2.3-5.
The pitch weighting function $P_2(k)$ of Equation (2.3-5), when multiplied
by the LPC spectrum, has proven to be efficient for the estimation
of the basis function.
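The construction of the pitch weighting function can be sketched as follows. This is an illustration under stated assumptions rather than the implementation used in the study: the impulse amplitudes are taken as powers of the pitch gain G, the magnitude of the transform is used as the weighting, and the ramps standing in for $W_1(k)$ and $W_2(k)$ are hypothetical, since their actual shapes are given only in Figure 2.3-5.

```python
import cmath
import math


def pitch_impulse_train(gain, period, n_samples):
    """Impulses at multiples of the pitch period with amplitudes gain**k."""
    p = [0.0] * n_samples
    k = 0
    while k * period < n_samples:            # stay inside the frame
        p[k * period] = gain ** k
        k += 1
    return p


def zero_mean_unit_power(p):
    """Remove the mean and scale to unit average power."""
    n = len(p)
    mean = sum(p) / n
    centered = [v - mean for v in p]
    power = sum(v * v for v in centered) / n
    scale = math.sqrt(power) if power > 0.0 else 1.0
    return [v / scale for v in centered]


def dft_magnitude(x):
    """Magnitude of the N-point DFT (magnitude is an assumption of this sketch)."""
    n = len(x)
    return [abs(sum(x[i] * cmath.exp(-2j * math.pi * i * k / n) for i in range(n)))
            for k in range(n)]


def blended_weighting(p1, crossover):
    """P2(k) = P1(k) W1(k) + W2(k) with hypothetical linear ramps:
    W1 falls from 1 to 0 at `crossover`, and W2 = 1 - W1."""
    out = []
    for k, v in enumerate(p1):
        w1 = max(0.0, 1.0 - k / crossover)
        out.append(v * w1 + (1.0 - w1))
    return out


p = pitch_impulse_train(0.5, 4, 10)
p1 = zero_mean_unit_power(p)
big_p1 = dft_magnitude(p1)
big_p2 = blended_weighting(big_p1, crossover=5)
```

With these ramps the weighting follows the pitch pattern at low indices and settles at unit amplitude above the crossover, which is the behavior the text asks of $P_2(k)$.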

The  basis  function  of  the  ATC  system  will  be  the  LPC  spectrum  in 
eq.  (2.3-1)  with  or  without  multiplication  of  the  pitch  weighting  function. 
Experiments  have  shown  that  a  closer  estimation  (in  the  mean  square 
error  sense)  of  the  basis  function  to  the  actual  spectrum  makes  the 
ATC  system  perform  better.  In  fact,  in  the  extreme  case,  where  the 
basis  spectrum  equals  the  DCT  spectrum,  quantization  of  the  DCT  parameter 
can  be  precise,  eliminating  all  types  of  distortion  as  long  as  the  sign 
bits of the DCT coefficients are provided. A post V/UV decision is made
on the basis of the signal-to-noise ratios in the frequency domain with and
without multiplication by the pitch weighting function; i.e., if the
signal-to-noise ratio of the basis function without the pitch weighting
function is higher than the one with the pitch weighting function,
it is better to treat that frame as unvoiced.

2.3.3 Bit Assignments Rule of the ATC System

It has been shown that the basis function of the ATC system plays an
important role in the performance of the ATC system. The choice of
bit assignments also determines how accurately the DCT coefficients
are encoded. Thus, it controls the distribution of the quantizing noise
in the frequency domain. The optimum bit assignment rule (under the minimum
mean square error criterion) for a stationary Gaussian-Markov process
has been derived in eq. (2.2-6) from rate distortion theory [15,16]. It can
be shown that the optimum bit assignment rule based on a minimum mean
square error leads to a flat noise distribution in the frequency domain.

It has been known that a flat noise spectrum in the frequency domain is not
the most desirable perceptual criterion. Tribolet and Crochiere [4] modified
the bit assignment rule of eq. (2.2-6) by multiplying by a weighting function
W(k) that weights the importance of the noise in different frequency
bands. They suggested the weighting function

$$W(k) = \sigma_s^{2\gamma}(k), \qquad k = 0, 1, \ldots, N-1 \tag{2.3-6}$$

where $\gamma$ is a parameter that can be experimentally varied from -1 to 0. As
the value of $\gamma$ is slowly varied between these two extremes ($-1 < \gamma < 0$),
the noise spectrum evolves from a flat distribution to one that
precisely follows the speech spectrum. Extensive experiments [17,18,19]
on noise shaping have shown that a noise spectrum which follows the
spectrum of the speech in certain ways provides slightly higher subjective
speech quality than a flat noise spectrum does. The value
$\gamma = -0.125$ was reported to give a good result [4]. However, when the data
rate decreases below 8 kb/s, the effects of the weighting function as well
as the optimum bit assignment rule cannot be described clearly. The
performance of the ATC system may be optimized asymptotically at the low data
rates by incorporating a simple limiter on the highest bit allocation. By
adjusting the largest number of bits allocated, the spectrum of the
noise can be varied. The spectrum of the noise will be flat for a large
value of the limiter output, and the spectrum of the noise will follow the
speech spectrum when the value of the limiter output is small (about 1).
Experiments have shown that a maximum bit assignment of 5
provides a good result at the data rate of 9.6 Kb/s.
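
The effect of the limiter can be illustrated with a small sketch. Eq. (2.2-6) is not reproduced in this section, so the code substitutes a standard greedy integer allocation (each bit goes to the coefficient whose estimated quantizing noise, variance / 4^bits, is currently largest). The function name `assign_bits`, the example variances, and the bit budget are hypothetical.

```python
import heapq


def assign_bits(variances, total_bits, max_bits=5):
    """Greedy integer bit assignment with a cap (limiter) on bits per coefficient."""
    bits = [0] * len(variances)
    # Max-heap (via negation) keyed on the current noise estimate var / 4**bits.
    heap = [(-v, k) for k, v in enumerate(variances) if v > 0]
    heapq.heapify(heap)
    remaining = total_bits
    while remaining > 0 and heap:
        neg_noise, k = heapq.heappop(heap)
        bits[k] += 1
        remaining -= 1
        # Each extra bit lowers the noise estimate by a factor of 4 (about 6 dB).
        # A coefficient that has reached the limiter is not reinserted.
        if bits[k] < max_bits:
            heapq.heappush(heap, (neg_noise / 4.0, k))
    return bits


if __name__ == "__main__":
    spectrum = [64.0, 16.0, 4.0, 1.0, 0.25]  # hypothetical basis-spectrum variances
    print(assign_bits(spectrum, total_bits=8, max_bits=5))  # [4, 3, 1, 0, 0]
    print(assign_bits(spectrum, total_bits=8, max_bits=2))  # [2, 2, 2, 2, 0]
```

With the larger limiter the bits track the spectrum, which is the flat-noise case. With the small limiter the assignment becomes nearly uniform, so the noise left in each coefficient follows the speech spectrum more closely.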

2.3.4  Quantization  of  Sideband  and  Mainband  Information 

The  quantization  effects  at  high  transmission  data  rates  do  not  cause 
a  perceptual  loss  of  performance  of  the  ATC  system.  However,  at  low  data 
rates,  these  quantization  effects  constitute  a  major  source  of  degradation 
of  the  synthesized  speech. 

Let $P_j$ be the jth actual spectrum and $P^s_j$ be the jth spectrum from
the side information. Then, the normalized DCT coefficient can be
expressed as

$$x_j = P_j / P^s_j \tag{2.3-7}$$

Let $B_j$ be the number of bits assigned to the jth DCT coefficient. Then $P_j$ is
a Gaussian distributed random variable if the samples of the time domain signal
are Gaussian distributed. The distribution of the normalized DCT
coefficients is, however, not known, and an analytical derivation of this
distribution function is too complicated to carry out. GTE Sylvania performed
simulations to develop the distribution function and the results are shown
in Figure 2.3-6. Note that the distribution function of $x_j$ lies in between
the Gaussian and Laplace distributions. GTE Sylvania has written a computer
program which calculates the characteristics of an optimum quantizer from
the simulated distribution function by following the procedure of Max [20].

Let $\hat{x}_j$ be the quantized value of $x_j$; then the procedure of Max
minimizes the mean square error, i.e.,

$$\varepsilon = E\left[(\hat{x}_j - x_j)^2\right] \tag{2.3-8}$$

where E denotes the statistical expectation. GTE Sylvania has determined
the distribution of $x_j$ under the given conditions of $B_j$, which is a function
of the ATC coder and speech signals. The conditional distribution of $x_j$
is slightly different from the distribution of Figure 2.3-6. GTE Sylvania
has developed a computer program which generates an optimum quantizer from
the conditional distribution of $x_j$. These procedures can be applied to
develop quantizing tables for every system parameter where the minimum
mean square error criterion is an adequate measure.
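
A quantizer of this kind can be approximated directly from simulated samples with the Lloyd iteration, which alternates the two optimality conditions used by Max (thresholds midway between output levels, output levels at the centroid of each cell). The following is a minimal sketch and not GTE Sylvania's program; the function names are hypothetical, and Gaussian training data stands in for the simulated distribution of Figure 2.3-6.

```python
import random


def lloyd_max(samples, levels, iterations=50):
    """Design a scalar quantizer that (locally) minimizes mean square error."""
    data = sorted(samples)
    n = len(data)
    # Start from equally populated cells (quantiles of the training data).
    reps = [data[(2 * i + 1) * n // (2 * levels)] for i in range(levels)]
    for _ in range(iterations):
        # Decision thresholds lie midway between neighbouring output levels.
        thresholds = [(a + b) / 2.0 for a, b in zip(reps, reps[1:])]
        sums = [0.0] * levels
        counts = [0] * levels
        cell = 0
        for x in data:  # data is sorted, so the cell index only moves upward
            while cell < levels - 1 and x > thresholds[cell]:
                cell += 1
            sums[cell] += x
            counts[cell] += 1
        # Each output level moves to the centroid (mean) of its cell.
        reps = [s / c if c else r for s, c, r in zip(sums, counts, reps)]
    thresholds = [(a + b) / 2.0 for a, b in zip(reps, reps[1:])]
    return thresholds, reps


def quantize(x, thresholds, reps):
    """Return (table address, quantized value) for one input sample."""
    index = sum(1 for t in thresholds if x > t)
    return index, reps[index]


if __name__ == "__main__":
    rng = random.Random(1)
    training = [rng.gauss(0.0, 1.0) for _ in range(10000)]
    thresholds, reps = lloyd_max(training, levels=4)  # a 2-bit quantizer
    print([round(t, 3) for t in thresholds])
    print([round(r, 3) for r in reps])
    print(quantize(0.3, thresholds, reps))
```

Replacing `training` with normalized DCT coefficients collected for a given number of bits would give one quantizing table per bit allocation, in the spirit of the conditional distributions described above.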

In the present ATC scheme, the LPC technique is used to calculate the
basis spectrum, with transmission requiring quantization of the PARCOR
coefficients. In this case, the error criterion of eq. (2.3-8) is modified as

$$\varepsilon = E\left[s(x_j)\,(\hat{x}_j - x_j)^2\right] \tag{2.3-9}$$

where $s(\cdot)$ is the weight function derived from the sensitivity analysis
of the power spectrum with respect to PARCOR coefficient variation [21]. A
large data base was used to accumulate the statistical information needed
for optimal quantization of the PARCOR parameters.

The quantizer for each variable, when optimized with the proper statistical
error criterion, appears adequate for the ATC system over a variety
of different speakers, acoustic noise conditions, and data rates.

2.3.5  Reducing  the  Effects  of  Lowpass  Filtering 

The  quality  of  the  ATC  coder  degrades  as  the  transmission  data  rate 
decreases.  One  of  the  major  sources  of  this  degradation  is  the  low  pass 
filtering  effect  which  can  be  explained  by  examining  Figure  2.3-7.  This 
figure  shows  that  no  DCT  coefficient  is  transmitted  in  frequency  band  2. 
However,  Figure  2.3-7  illustrates  that  the  basis  spectrum  which  is  derived 
from  LPC  techniques  closely  follows  the  actual  spectrum.  The  DCT  coeffi¬ 
cients  of  frequency  band  1  are  quantized  from  the  LPC  basis  spectrum, 
where  the  phase  (sign  in  this  case)  and  amplitude  information  are  modified 
from  the  LPC  basis  spectrum.  This  change  results  in  an  improvement  over  the 
LPC  technique.  In  frequency  band  2,  the  LPC  basis  spectrum  cannot  be  modified 
since no bits are assigned. Zelinski and Noll, who use a different
basis spectrum, substitute the DCT coefficients in frequency band 2
with noise samples whose variances are derived from the side
information. Some improvements have been reported at low data rates.

This  technique  perceptually  adds  some  bandwidth  to  the  ATC  system, 
but  introduces  some  hoarseness  to  the  speech.  This  hoarseness  arises  from 
the  destruction  of  pitch  harmonics  in  the  frequency  domain,  and  can  be 
reduced  by  using  the  LPC  basis  spectrum  modified  by  the  pitch  weighting 
function  of  eq.  (2.3-5).  This  basis  function  is  shown  in  Figure  2.3-7 
with  symbol.  The  optimized  ATC  coder  with  the  "fill  in"  procedure 
produces  high  quality  synthesized  speech  above  the  data  rate  7200  b/s. 
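
The "fill in" step can be sketched as follows: every DCT coefficient that received no bits is replaced at the receiver by a noise sample whose standard deviation is taken from the basis spectrum. This is a minimal illustration under assumptions; the function name `fill_in`, the argument layout, and the Gaussian noise source are hypothetical, and `basis` may be either the plain LPC basis spectrum or the pitch-weighted one.

```python
import random


def fill_in(coeffs, bits, basis, rng):
    """Replace coefficients that received zero bits with noise samples."""
    filled = []
    for value, b, sigma in zip(coeffs, bits, basis):
        if b > 0:
            filled.append(value)                  # transmitted: keep decoded value
        else:
            filled.append(rng.gauss(0.0, sigma))  # not transmitted: noise scaled by basis
    return filled


if __name__ == "__main__":
    rng = random.Random(0)
    decoded = [1.5, -0.7, 0.0, 0.0]   # band 2 coefficients arrive as zeros
    bits = [3, 2, 0, 0]               # bit assignment derived from side information
    basis = [1.0, 0.8, 0.2, 0.1]      # basis spectrum (standard deviations)
    print(fill_in(decoded, bits, basis, rng))
```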

2.3.6 Bit Allocations to Sideband and Mainband

The performance of the ATC system depends on several system devices,
including quantizer characteristics, bit assignment rules,
methods of estimating the basis function, bit allocations to the sideband
and mainband, etc. It has been shown that the estimation of the
basis function plays an important role in the performance of the ATC system.
However, it is desirable to allocate fewer bits for the generation of
the basis function so that more bits remain for encoding DCT coefficients.
Thus, tradeoff analyses were conducted in the area of DCT coefficient and
LPC parameter quantization.

First, the performance of the ATC system was measured with the
basis spectra estimated by 10th order and 8th order LPC processes (no
quantization was applied to the PARCOR parameters). Both SNR measurements
and informal listening tests showed no significant differences. This
is an important finding, since the quantization of this sideband
information consumes a fair amount of the available data rate. Any
conservation of bits in the LPC process can be used to improve or protect the
transmission of the DCT coefficients.

Second,  a  large  data  base,  comprised  of  15  male  and  female  speakers 
totalling  30,000  frames,  was  used  to  create  relative  frequency  histo- 
grams  for  each  PARCOR  coefficient.  The  probability  density  functions 
were  derived  from  this  data  and  used  with  the  technique  described  in 
section  2.3.4  to  develop  optimal  quantizers. 

The bit allocation strategies for the sideband information are
shown in Table 2.3-1. Combining all the sideband information parameters,
the total data rate is in the range of 35 ≤ bits/frame < 45.

                      # of bits/frame
Parameter          Tested        Decided
PARCOR 1           4, 5          5
PARCOR 2           3, 4, 5       5
PARCOR 3           3, 4          4
PARCOR 4           3, 4          4
PARCOR 5           2, 3          3
PARCOR 6           2, 3          3
PARCOR 7           2, 3          2
PARCOR 8           2, 3          2
pitch, M           6             6
pitch gain, G      2, 3          2
variance           5             5
sync               1             1
V/UV               1             1

TABLE 2.3-1 BIT ALLOCATIONS TO THE SIDEBAND INFORMATION

Since it is necessary to allocate bits for the error correcting code in order
to reduce the effects of channel errors, the data rate may be increased to
85 < bits/frame < 95. By allocating 7200 bits/second for the encoding of
DCT coefficients, there are 2400 b/s left for the sideband information.
This limits the frame updating rate to less than 30 frames/second, which
forces a frame size of 256 samples at a 6400 Hz sampling rate. In
order to reduce the effects of the signal discontinuities at the frame
boundaries, the frames are overlapped slightly (10 samples). Therefore,
there are 369 bits per frame (246 samples) at the 6400 Hz sampling rate for
the 9.6 Kb/s ATC system.
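
The frame arithmetic above can be checked with a few lines. The figures are taken from the text and from the "Decided" column of Table 2.3-1; the variable names are illustrative only.

```python
SAMPLE_RATE_HZ = 6400
CHANNEL_BPS = 9600
DCT_BPS = 7200          # rate reserved for the DCT coefficients
FRAME_SAMPLES = 256
OVERLAP_SAMPLES = 10

new_samples = FRAME_SAMPLES - OVERLAP_SAMPLES                 # 246 new samples per frame
frames_per_second = SAMPLE_RATE_HZ / new_samples              # frame update rate
bits_per_frame = CHANNEL_BPS * new_samples / SAMPLE_RATE_HZ   # total channel bits per frame
side_budget = (CHANNEL_BPS - DCT_BPS) / frames_per_second     # bits/frame left for sideband

# "Decided" column of Table 2.3-1.
decided_bits = {
    "PARCOR": [5, 5, 4, 4, 3, 3, 2, 2],
    "pitch": [6],
    "pitch gain": [2],
    "variance": [5],
    "sync": [1],
    "V/UV": [1],
}
sideband_bits = sum(sum(v) for v in decided_bits.values())

if __name__ == "__main__":
    print(f"frames/second      = {frames_per_second:.2f}")
    print(f"bits/frame         = {bits_per_frame:.0f}")
    print(f"sideband bits      = {sideband_bits}")
    print(f"sideband budget    = {side_budget:.2f} bits/frame incl. error protection")
```

The sideband parameters sum to 43 bits/frame, inside the stated 35 to 45 range, and the 2400 b/s sideband allowance corresponds to roughly 92 bits/frame, inside the 85 to 95 range quoted once error protection is included.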

With the above constraints, the performance of the 9.6 Kbps ATC system
was evaluated with various combinations of bit allocations to
the sideband information parameters. As a result, the combination of
bits shown in Table 2.3-1 was determined to be optimal with respect to
objective measurements (SNR). The performance of the ATC system is plotted
with respect to the data rate in Figure 2.3-8 with the sideband data rate
shown in Table 2.3-1. The figure shows that the decision on the sideband
data rate is adequate for the ATC system from 6800 b/s to 9600 b/s, since
its performance is not sensitive to the changes of the data rate.


2.3.7  Reducing  Discontinuities  at  the  Frame  Boundary 

The adaptive transform coding scheme of Figure 2.2-2 produces noise-like
"burbling"  and  "click"  sounds  at  low  data  rates.  This  noise  is  generated 
at  the  frame  boundaries  by  waveshape  discontinuities  in  the  time  domain. 

The  noise  generated  from  these  discontinuities  cannot  be  entirely  eliminated, 
but  can  be  reduced  by  overlapping  frames  slightly  and  by  interpolating  across 
the  frame  boundaries. 

The frame size of the ATC system may be chosen as 128 samples/frame
in the previous section. However, the frame size was increased to 256
samples to reduce the effects of signal discontinuities at the frame
boundaries. The FORTRAN simulations of the ATC system at frame sizes of
128 and 256 samples revealed two beneficial findings. First, a larger
frame size does not adversely affect the signal-to-noise ratio (SNR) but
noticeably improves the subjective voice quality. Second, a larger frame
size with pitch weighting is better than a smaller frame size with pitch
weighting. Both these findings can be explained rather simply. The short
term speech spectrum may not be stationary for a large frame size (246
samples), which may cause the synthesized spectrum to be smoothed more than
it should be. However, since the frames are updated half as often, there
are half as many frame discontinuities. In the ATC system, the frame
discontinuities are the most obvious distortion and are lessened significantly
with the 256 sample frame size. As the frame size increases, the
resolution of the FFT increases as well due to an increase in the FFT order (N),
i.e.,

$$\text{frequency resolution} = \frac{BW}{N}, \qquad BW = \text{signal bandwidth} \tag{2.3-10}$$

A finer frequency resolution of the pitch weighting spectrum places more
striations in the LPC basis spectrum. Hence, a more detailed sampling of
the spectrum is achieved for larger frame sizes.
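
As a worked instance of eq. (2.3-10), assume that the signal bandwidth is half of the 6400 Hz sampling rate (BW = 3200 Hz; this value is an assumption, since BW is not stated here) and that N equals the frame size:

$$\frac{BW}{N}\bigg|_{N=128} = \frac{3200\ \text{Hz}}{128} = 25\ \text{Hz}, \qquad \frac{BW}{N}\bigg|_{N=256} = \frac{3200\ \text{Hz}}{256} = 12.5\ \text{Hz}$$

Under these assumptions, doubling the frame size halves the spacing between spectral samples, so roughly twice as many samples fall between adjacent pitch harmonics.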

2.4 ATC System in the Presence of Random Channel Errors

Our FORTRAN simulations have shown that the ATC system produces high
quality synthesized speech at 9600 bps if no channel errors are present.
Error-free transmission, however, is not always possible since the transmitted
signals are often affected by channel characteristics and by noise which
may or may not vary in time. Additive Gaussian noise is the main source
of signal corruption in many digital data transmission systems and will
introduce random channel errors (error positions are independent of time).
These channel errors may degrade speech quality, and the degradation of
speech may depend on the positions of these channel errors.

In the following sections, the design of the ATC system will be examined
and changed to optimize the performance of the system under the
influence of random channel errors ranging from a bit error rate (BER) of 0
to 10^-2. First, the effects of random channel errors on the performance
of the ATC system will be examined in section 2.4.1. Afterwards, the
performance of the ATC system, as a function of data rate and channel error
rate, will be provided in section 2.4.2. Then forward error correcting
codes will be employed to reduce the effects of channel errors. The
application of BCH codes, which are presently the most powerful
random-error-correcting codes, will be presented in section 2.4.3. The
performance of the ATC system is sensitive to the errors in the sideband
information, since the bit assignments of the DCT coefficients depend on
the sideband information. Some DCT coefficients are more important than
the others in the sense of maintaining the system's performance with no
channel errors. The selection and protection of the important bits in
the ATC system were made by analyzing the performance of the ATC system with
various bits protected. Then, in section 2.4.4, we selected the parameters
of the BCH code used to protect these bits. Finally, the conclusions are
summarized in section 2.4.5.

2.4.1  The  Effects  of  Random  Channel  Errors  on  the  Performance  of  the  ATC 
System  at  9.6  Kbps 

The ATC algorithm was originally designed for an error-free channel.
A noisy channel, however, will introduce errors in the received bits. The
performance of the ATC coder, given by its signal-to-quantization-noise
ratio (SNR), is plotted in Figure 2.4-1 with respect to the channel error
rate. These plots were obtained by evaluating various types of speech
totaling about 30 sec.

Degradation of the synthesized speech due to the channel errors was
not noticed in the informal listening tests when the channel error rate
was lower than 10^-4. However, the quality of the speech as well as the
SNR drops rapidly when the channel error rate is higher than 10^-3. It is,
therefore, desirable to protect the system performance at the higher
channel error rates (> 10^-3).


2.4.2  Tradeoff  Analysis  Between  Data  Rate  and  Channel  Error  Rate 

The performance of the ATC system was evaluated in terms of the
SNR with the transmission data rate varying from 7700 to 9600 bps. The
results are plotted in Figure 2.4-2 together with the performance of the
ATC system under the influence of random channel errors. As can be seen
from this figure, the performance of the system does not drop rapidly as
the transmission data rate decreases, while the performance of the system
drops rapidly when the channel error rate is higher than 10^-3. Since
some channel errors can be corrected by utilizing error correcting codes,
the error rate of the information bits can be reduced depending on the
channel characteristics and the selection of error correcting codes.
However, additional information (parity bits) has to be sent to the
receiver to correct channel errors. Therefore, reducing the channel errors
requires reducing the source information rate. Figure 2.4-2 shows that
the performance of the ATC system can be improved by using error
correcting codes when the channel error rate is higher than 10^-3, since the
performance of the system drops slowly when the source information rate
decreases. However, error correcting codes are not advisable for reducing
the effects of channel errors when the error rate is lower than 10^-3,
since the performance of the system does not drop rapidly as the channel
error rate increases.

2.4.3  Application  of  BCH  Code 

There  are  many  ways  of  utilizing  error  correction  codes  to  reduce  the 
effects  of  channel  errors.  The  method  of  correcting  errors  depends  on 
the  application,  i.e.,  data  rate,  channel  error  rate,  complexity,  cost, 
etc.  Since  the  ATC  coder  is  designed  for  real-time  implementation,  error 
correcting  code  must  not  require  a  large  time  delay  for  correcting  chan¬ 
nel  errors.  Block  codes  of  short  length  are  suited  to  the  real-time  im¬ 
plementation  of  ATC  algorithm.  These  codes  require  no  additional  time 
delay  to  process  the  error  correcting  algorithm  if  the  length  of  the 
block  code  is  less  than  the  number  of  bits  received  in  a  frame  period. 
There  are  many  types  of  block  codes  that  can  be  properly  used  depending 
on  the  channel  characteristics. 

Most real communication channels corrupt signals in many ways. The
signals may be corrupted by additive Gaussian noise and/or impulsive
noise that produce random and burst errors, respectively. In other
situations, the characteristics of the channel may vary in time (fading
channel) or the channel may be selected at random from an ensemble of
channels with widely different characteristics, such as the switched
telephone network. It is very hard to construct an error-control system
which adapts to various types of channels. Since additive Gaussian noise
is the main source that corrupts signals in many practical communication
channels, the most practical error correcting code is one which is
capable of correcting random errors.

The Bose-Chaudhuri-Hocquenghem (BCH) codes are a generalization
of Hamming codes for correcting multiple errors. They are
well known to be the most powerful random-error correcting codes, and
a decoding algorithm that can be implemented with a reasonable amount of
complexity has been devised for these codes [10, 12]. A more fundamental
description of the BCH codes and their encoding and decoding algorithms
is given in Appendix A. This appendix shows that with a block length
of $n = 2^m - 1$ and mt parity checks, it is possible to correct
any t or fewer errors in a primitive (n,k) BCH code, where k is the
number of information bits. The proper choice of m, n, t in a primitive
BCH code depends on the channel error rate, data rate, and the system's
specifications. For the real-time implementation of the ATC coder, the values
of t=3 and m=6, 7, 8 are considered reasonable choices.
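
The candidate codes implied by these choices can be listed with a short sketch. It uses the relations stated above (block length 2^m - 1 and m*t parity checks); for general BCH codes m*t is only an upper bound on the number of parity checks, and the function name is hypothetical.

```python
def primitive_bch(m, t):
    """Return (n, k, rate) for a primitive BCH code with m*t parity checks."""
    n = 2 ** m - 1          # block length
    parity = m * t          # parity checks (an upper bound in general)
    k = n - parity          # information bits
    return n, k, k / n


if __name__ == "__main__":
    for m in (6, 7, 8):
        n, k, rate = primitive_bch(m, t=3)
        print(f"m={m}: ({n},{k}) code, information rate {rate:.4f}")
```

For t = 3 this yields the (63,45), (127,106), and (255,231) codes; the first two are the ones examined in the remainder of this section.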

The performance of random-error correcting BCH codes is expressed
in terms of error probability. Let $p(m,n)$ be the probability of m errors
occurring in an n-bit block and let $\beta_m$ denote the probability of decoding
an error pattern of weight m correctly; then the probability of decoding the
received code word erroneously may be expressed as

$$P_e = 1 - \sum_{m=0}^{n} \beta_m\, p(m,n) = \sum_{m=0}^{n} \alpha_m\, p(m,n) \tag{2.4-1}$$

where $\alpha_m = 1 - \beta_m$ denotes the probability of erroneously decoding an error
pattern of weight m. The parameter $\alpha_m$ is a function of the code and
decoding algorithm. If a t error-correcting BCH code is employed and it
is decoded with the Peterson decoding algorithm shown in Appendix A, the
parameter may be expressed as

$$\alpha_m = \begin{cases} 0, & 0 \le m \le t \\ 1, & t < m \le n \end{cases} \tag{2.4-2}$$

and the probability of erroneously decoding the code word may be reduced
from eq. (2.4-1) to

$$P_e = \sum_{m=t+1}^{n} p(m,n) \tag{2.4-3}$$

If the bit errors occur independently and at random with probability e,
then the probability p(m,n) can be expressed as

$$p(m,n) = \binom{n}{m} e^m (1-e)^{n-m} \tag{2.4-4}$$

where the probability p(m,n) is simply the binomial distribution and $P_e$
in eq. (2.4-3) is simply the tail of the distribution.

Let the channel error rate be e = 10^-2, which is specified by the
contract. Let the block length of the BCH code be n = 127 and t = 3; then the
probability of error occurring in the block can be written as

$$P_e = \sum_{m=4}^{127} p(m,127)
      = 1 - p(0,127) - p(1,127) - p(2,127) - p(3,127)
      = 0.0393 \tag{2.4-5}$$

In this BCH code, the information rate may be expressed as

$$R = k/n = 106/127 = 0.8346 \tag{2.4-6}$$

where 16.54% of the data is used for the redundant parity checks. The
information rate for the (127,106) BCH code from eq. (2.4-6) is a high
83.46%. However, the probability of error occurring in the block is
also high since, from eq. (2.4-5), one block in error is expected
for each 25 blocks. Let n = 63, k = 45, t = 3 (the (63,45) BCH code);
then the probability of error occurring in the block of 63 bits will be

$$P_e = \sum_{m=4}^{63} p(m,63)
      = 1 - p(0,63) - p(1,63) - p(2,63) - p(3,63)
      = 0.003725 \tag{2.4-7}$$

The information rate R is about 71.43%, which is lower than that of the
(127,106) BCH code, but only one erroneous block is expected out of 268 blocks,
which turns out to be a proper choice in the following section.
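
The block error probabilities quoted above follow directly from the binomial tail of eqs. (2.4-3) and (2.4-4). A minimal sketch for checking them is given below; the function name is illustrative.

```python
from math import comb


def block_error_probability(n, t, e):
    """Probability of more than t errors in an n-bit block with bit error rate e."""
    correctable = sum(comb(n, m) * e ** m * (1.0 - e) ** (n - m)
                      for m in range(t + 1))
    return 1.0 - correctable


if __name__ == "__main__":
    for n, k in ((127, 106), (63, 45)):
        pe = block_error_probability(n, t=3, e=1e-2)
        print(f"({n},{k}) code: rate {k / n:.4f}, P_e = {pe:.6f}, "
              f"about one bad block in {1.0 / pe:.0f}")
```

At e = 10^-2 this gives a block error probability of about 0.0393 for the (127,106) code and about 0.0037 for the (63,45) code, in agreement with the values in the text.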

2.4.4 Selection and Protection of the Important Bits in the ATC System

The performance of the ATC system was evaluated under various simulated
channel error rates (random errors) to see the effects of
channel errors. Two independent error generators were used on separate
regions of the bit allocation to isolate the most
sensitive bits of the 9600 bps system. One error rate, error rate
A, was applied to the bit stream of the DCT coefficients. The other
error rate, error rate B, was applied to the bits allocated to the sideband
information parameters. The results are plotted in Figure 2.4-3.

In this figure, the cumulative SNR does not degrade more than 9 dB when
the channel errors are applied to every bit at the rate 10^-2. Although
the processed sentence is generally intelligible, there are periods when
concentrated errors lead to undesirable distortions (pops, clicks, etc.).

On the other hand, protecting a few bits of the sideband information
(42 bits/frame out of 369 bits/frame) improves the SNR by
4.2 dB. Therefore, protection of the sideband information from the
channel errors is necessary to minimize the reduction in SNR, since the
performance of the ATC system is very sensitive to the errors in the sideband
information.

To further improve the system's performance at error rates less than 10^-2,
a primitive BCH code was applied for the partial protection of the bits
related to the DCT coefficients as well as for the protection of the sideband
information. The protection of all information is not considered practical
because of software requirements (computation time and program size), and,
as Figure 2.4-5 shows, the protection of all information does not lead to
the highest system performance at error rates below 10^-2.

In the early simulations of the ATC system with channel errors by
Zelinski and Noll [1, 2], no channel errors were applied to the sideband
information or to the most significant bit of each DCT coefficient. The
performance of the system was shown to be insensitive to changes of
the channel error rate up to 5%. Although the ATC system has been modified,
the protection of the most significant bit of each DCT coefficient may
lead to a good selection of important bits. We modified this scheme as shown
in Figure 2.4-4 (diagonal protections). In this figure, the quantized DCT
coefficients (not the value of the quantized DCT, but the decimal number
or address of the quantization tables which is ready to serialize for
transmission through the channel) are ordered in descending magnitude. The
information on magnitude and the number of bits for each DCT coefficient is
obtained from the sideband information. In the "diagonal protections," the
bits to be protected are selected in the sequence 1→6→10→14→17→ up to
the desired number of bits. Another way of selecting important bits is
the technique of "horizontal protections," where the bits are selected
for protection in the horizontal direction in Figure 2.4-4 as 1→2→6→10→3→7→
up to the desired number of bits. In order to select the number of bits
to be protected, two block lengths of BCH codes (i.e., the (63,45) and
(127,106) BCH codes) have been applied to the ATC system with the selections
of important bits described as "horizontal protections" and "diagonal
protections." The number of bits to be protected as well as the performance
of the ATC system are tabulated in Table 2.4-1 at the channel error rates
0 and 10^-2. As can be seen from the table, "horizontal protection"
performs better than "diagonal protection" by 1.3 dB when two blocks of a
(127,106) BCH code are applied to the ATC system at the channel error
rate 10^-2. Another observation is that the (127,106) BCH code improves
performance over a (63,45) BCH code by 1.1 dB in "horizontal protection"
at the channel error rate 10^-2. However, informal listening tests
indicate that there are periods that contain a large amount of distortion
(pops, clicks, etc.) which lead to major objectionable speech degradation
at 9600 bps. The main source of this degradation is the frames of
speech that have more than 3 errors in a block, which cannot be corrected
by the system. The probability of more than 3 errors occurring in a block is
0.039 from eq. (2.4-5) when the (127,106) BCH code is employed for
channel error correction at the rate 10^-2. This probability is reduced to
0.0037 when the (63,45) BCH code is incorporated. Thus, to reduce the
number of frames that contain a large amount of distortion, one should
use the (63,45) BCH code rather than the (127,106) BCH code when the
channel error rate is 10^-2.

Finally, Table 2.4-1 indicates that it may be better to protect
more than 90 bits out of a frame (369 bits/frame) to increase the SNR at
the error rate 10^-2. To protect more bits, one must use more parity bits,
which reduces the number of bits for encoding speech signals. The tradeoff
between the number of parity bits and error rates on the
performance of the ATC system was investigated by using a (63,45) BCH
code coupled with selecting the important bits by the technique of
"horizontal protection."

To find the best number of bits to be protected, the performance
of the ATC system was evaluated at several different channel
error rates by varying the number of blocks of (63,45) BCH code
in a frame. The results are tabulated in Table 2.4-2. The performance
of the system is plotted in Figure 2.4-5 with the channel error rate
fixed at several values and the number of blocks of a (63,45) BCH code
varied from 0 to 5. The performance of the system is also plotted
in Figure 2.4-6 with the number of blocks protected from channel errors
fixed at several values and the channel error rate varied from 0 to 10^-2.
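The cost side of this trade-off is simple arithmetic. As a rough sketch (the function and key names are illustrative, and it assumes every block is a full (63,45) block inside the 369-bit frame), the following lists how many bits are protected and how many parity bits are spent for 0 to 5 blocks.

```python
FRAME_BITS = 369   # bits per frame at 9600 bps, as stated in the report
N, K = 63, 45      # (63,45) BCH: 45 protected bits plus 18 parity bits per block


def protection_budget(blocks: int) -> dict:
    """Bit budget of one frame when `blocks` BCH blocks are used."""
    parity = blocks * (N - K)
    return {
        "blocks": blocks,
        "protected_bits": blocks * K,
        "parity_bits": parity,
        "bits_left_for_source": FRAME_BITS - parity,
    }


for b in range(6):
    print(protection_budget(b))
```

Each added block protects 45 more bits at the price of 18 bits taken from the source coder, which is why the SNR eventually stops improving as blocks are added.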

As noted from Figure 2.4-5, the performance of the ATC system
improves rapidly as the number of blocks (or number of bits) to be protected
increases at the channel error rate 10^-2. As the number of blocks reaches
3, the performance of the ATC system saturates, while the best
performance (denoted by A in Figure 2.4-5) is obtained when 4 blocks
of a (63,45) BCH code are employed at an error rate of 10^-2. At the channel
error rate 5 x 10^-3, the best performance is obtained when 3 blocks of
a (63,45) BCH code are applied to protect the important bits of the ATC
system. At the channel error rate 10^-3, the protection of one block
(45 bits) is sufficient to obtain the best performance of the system. It
is not necessary to protect any bits in order to reduce the effects of
channel errors when the error rate is lower than 5 x 10^-4.

For the real-time implementation of the ATC system, the recommendation was
to use 3 blocks of a (63,45) BCH code because of the computation time
and the saturation of the ATC system's performance. This conclusion was
reached from Figure 2.4-6, since the performance of the ATC system,
when 3 blocks of a (63,45) BCH code are employed to reduce the effects of
channel errors, is consistent and high for channel error rates below
10^-2. The degradation of the performance due to channel errors of
rate up to 10^-2 is less than ?76 dB. Informal listening tests indicate
that the speech quality at 9600 bps is high and consistent in the presence
of channel errors of rate up to 10^-2.

2.4.5  Summary 

The ATC system had its transmission data sent through a Gaussian noise
channel to investigate the effects of random channel errors. The
speech quality, or the signal-to-quantization noise ratio, does not
degrade significantly when the channel error rate is lower than 10^-3.
However, the degradation of speech quality when the channel error rate
is higher than 10^-3 is so severe that one must reduce the effects of the
channel errors. Since the performance of the ATC system has been
insensitive to changes in the source information data rate,
it was possible to employ an error correcting code to reduce the effects
of the channel errors.

BCH codes were briefly described and were used to improve performance
with channel errors. The selection and protection of the important bits
in the ATC system were conducted by utilizing small block lengths (i.e.,
63 or 127) of BCH codes which correct up to 3 errors in the block. As a
result, the selection of the important bits in the ATC system was made by
the technique of "horizontal protection." The (63,45) BCH code was also
selected to reduce the number of periods which contain a large amount of
signal distortion caused by a large number of burst errors (>4) in the
protected block.

The optimum performance of the ATC system was obtained for a given
channel error rate. For example, the optimum performance of the ATC
system (15.95 dB of SNR) was obtained when 4 blocks of a (63,45)
BCH code are incorporated to reduce the effects of channel errors
of rate 10^-2. However, we recommended using 3 blocks of a (63,45)
BCH code for protection against channel errors up to the rate 10^-2 because
of the saturation of the ATC system's performance and the real-time
computation capability.

Finally, based on the SNR performance, we have shown that the ATC system
designed in this study produces a high and consistent quality of
speech at the data rate 9600 bps in the presence of random channel errors
up to the rate 10^-2.

2.5 FORTRAN Program for the Simulation of the ATC System

The ATC scheme developed in the previous sections is programmed in
FORTRAN, and the simulations of the ATC scheme are performed on a PDP-11
computer with an RSX-11M operating system.

The FORTRAN program is described first in section 2.5.1. The
task building of the program from the source file and the operation of the
program are shown in section 2.5.2. Appendix C contains a source listing
of all FORTRAN programs for the ATC simulation.

2.5.1  FORTRAN  Program  of  the  ATC  Algorithm 

The ATC algorithm was developed using the FORTRAN programs before the
real-time implementation of the ATC scheme began on the MAP-300 of CSPI,
Inc. The flow diagram of the algorithm is shown in Figure 2.5-1. The
programs consist of the main routine (ATC70) and 25 subroutines. The
functional description of the program given below follows the flow
diagram of Figure 2.5-1.

First, the parameters of the ATC system are defined in the initial setup
routine within ATC70. The quantizer tables for the sideband parameters and DCT
coefficients are also defined in this routine. The use of the pitch
weighting function, the sorting technique (fast but approximate or slow
but exact), input speech sampling rate, simulation channel error rate, and
the information on the input/output speech file (name of the file and the
storage device of the system, etc.) are defined in the subroutine OVR2.

The characteristics of the ATC system are defined here and the program
is ready to execute over and over until it is terminated.

The number of frames that are processed by the program is updated in
the main program (labeled M1 on Figure 2.5-1) and the input speech is read
and buffered by the subroutine TAPE3. The mean and variance of the input
signal are calculated for the normalization in the main program at M2, and
the discrete cosine transform (DCT) is performed on the normalized input
signal in the subroutine OVR1. The vectors are shuffled in the main routine
at M3 for the fast calculation of the pseudo autocorrelation function (ACF)
of the input speech signal, which is performed in the subroutine OVEVOD.

FIGURE 2.5-1: FLOW DIAGRAM OF THE ATC FORTRAN PROGRAM

The pseudo-ACF is searched for a maximum to obtain the pitch period,
M. The corresponding pitch gain, G, is the ratio of the pseudo-ACF at M
to its value at the origin. The pitch period, M, and the pitch gain,
G, are determined in the main routine at M4. The PARCOR coefficients are
calculated from the normalized pseudo-ACF in the subroutine OVR3. The
quantization and dequantization of the PARCOR coefficients, pitch
period, pitch gain, DC bias, and variance of the input speech signal are
performed in the main program at M5. The LPC filter coefficients are
calculated from the PARCOR coefficients in the subroutine OVR4, and the
generation of the LPC excitation source is performed in the main program at M6.
A DFT is performed on the LPC excitation source to get the LPC basis spectrum
in the subroutine OVR5, and this spectrum is normalized in the main routine at
M7. The pitch weighting function is generated in the subroutine PITWT,
and it is multiplied by the LPC basis spectrum to form the ATC basis
spectrum in the main program at M8.
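The pitch search described above reduces to a peak pick followed by one division. A minimal sketch follows; the lag search range and the function name are assumptions made for illustration, since the limits used in the FORTRAN program are not given in this section.

```python
def pitch_from_acf(acf, min_lag, max_lag):
    """Return (M, G): the lag of the largest ACF value in [min_lag, max_lag]
    and the pitch gain G = acf[M] / acf[0]."""
    lags = range(min_lag, max_lag + 1)
    period = max(lags, key=lambda lag: acf[lag])
    gain = acf[period] / acf[0] if acf[0] > 0 else 0.0
    return period, gain


# Toy pseudo-ACF with peaks at lags 40 and 80 (hypothetical data).
acf = [0.0] * 101
acf[0], acf[40], acf[80] = 1.0, 0.8, 0.6
M, G = pitch_from_acf(acf, min_lag=16, max_lag=100)
print(M, G)
```

Restricting the search to lags above a minimum keeps the peak at the origin from being selected as the pitch period.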

The bit assignment rule is derived from the basis spectrum and the
quantization of the DCT coefficients is performed. First, the basis
spectrum is sorted in descending magnitude in the subroutine OVR6 (fast
sorting routine) or OVR7, depending on the terminal input. The fast sorting
routine may provide only an approximation to the result of the slow but exact
sorting routine OVR7. The bit assignment and the quantization of the DCT
coefficients are performed in the main program at M9. The quantized decimal
inputs are serialized into a binary vector in the subroutine OVR8, and the
encoding of a (63,45) BCH code is performed in the subroutine OVR9.

The encoded binary data is then transmitted through the simulated noisy
channel, in which the transmitted information may be altered by
the introduction of channel errors. Simple tests are performed in the
main routine at M10, M11, M12, and M13.
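The simulated channel can be pictured as independent bit flips at the chosen error rate. The sketch below is an assumption about how such a channel might be modeled (a binary symmetric channel); it is not taken from the FORTRAN listing, and the names are illustrative.

```python
import random


def binary_symmetric_channel(bits, error_rate, rng):
    """Flip each bit independently with probability `error_rate`."""
    return [b ^ 1 if rng.random() < error_rate else b for b in bits]


rng = random.Random(1)                  # fixed seed for repeatable runs
frame = [rng.randint(0, 1) for _ in range(369)]
received = binary_symmetric_channel(frame, 1e-2, rng)
n_errors = sum(a != b for a, b in zip(frame, received))
print("bit errors in this frame:", n_errors)
```

At an error rate of 10^-2 a 369-bit frame carries about 3.7 errors on average, so most frames are hit at least once.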

At the receiver side, the received sequence of binary data is fed to
the decoder routine in the subroutine OVR10 for the correction of errors,
if any. The sideband information is obtained first by unpacking the corrected
binary vector in the subroutine OVR11. The sideband information, which
consists of PARCOR coefficients, pitch period, pitch gain, and mean and variance
of the input signal, is dequantized in the main routine at M14. In order to
generate the basis spectrum, LPC filter coefficients are calculated from
the PARCOR coefficients in subroutine OVR4, and the time domain excitation
source for the LPC spectrum is generated in the main routine at M15.
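The conversion from PARCOR coefficients to LPC filter coefficients is usually done with the step-up recursion. The sketch below uses one common sign convention, in which the predictor is x_hat(n) = sum_j a_j x(n-j); the convention used in OVR4 is not shown in this section, so both the convention and the function name are assumptions.

```python
def parcor_to_lpc(parcor):
    """Step-up recursion: PARCOR coefficients k_1..k_p -> predictor
    coefficients a_1..a_p, using a_i^(i) = k_i and
    a_j^(i) = a_j^(i-1) - k_i * a_(i-j)^(i-1)."""
    a = []
    for i, k_i in enumerate(parcor, start=1):
        # a[j] holds a_(j+1) of the previous order (length i - 1).
        a = [a[j] - k_i * a[i - 2 - j] for j in range(i - 1)] + [k_i]
    return a


print(parcor_to_lpc([0.5, 0.25]))
```

Because the receiver runs the same recursion on the dequantized PARCOR values, transmitter and receiver arrive at the same basis spectrum without the LPC coefficients being transmitted.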

The LPC spectrum is generated in the subroutine OVR5, and the pitch weighting
function is calculated in the subroutine PITWT for the case of voiced sounds. The
ATC basis spectrum is obtained by the multiplication of the LPC basis spectrum and
pitch weighting function in the main program at M16 and M17. The basis spectrum is
again sorted in descending magnitude in the subroutine OVR6 or OVR7. The bit
assignment rule is exercised again from the sorted basis spectrum in the main
program at M18. The mainband information is obtained by unpacking the received
binary data in the subroutine OVR12. The dequantization of the DCT coefficients
is performed in the main program at M19, and the inverse DCT is performed to
reproduce the time domain signal in the subroutine OVR1.

The time domain signal is renormalized by the mean and variance of
the input signal and interpolated in order to reduce the effects of
signal discontinuities at the frame boundaries in the main routine at M20. This
reproduced signal is fed to the output device in the subroutine TAPE3, and
the post analysis (measuring the signal-to-noise ratio) is performed in the
main program at M21. These procedures are repeated until the desired number
of frames has been processed by the program.
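The post analysis amounts to comparing the reproduced signal with the input. A minimal sketch of an SNR measurement is given below; whether the program accumulates the ratio over the whole file or frame by frame is not stated here, so the single overall ratio and the function name are assumptions.

```python
import math


def snr_db(original, reproduced):
    """Signal-to-noise ratio in dB: signal energy over error energy."""
    signal = sum(x * x for x in original)
    noise = sum((x - y) ** 2 for x, y in zip(original, reproduced))
    if noise == 0:
        return float("inf")
    return 10.0 * math.log10(signal / noise)


print(snr_db([10.0, 0.0], [9.0, 0.0]))
```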

2.5.2  Task  Building  of  the  ATC  Program 

A magnetic tape was sent to DCA containing all of the source files
necessary to build the ATC program.

Also included were 10 frames of data with zero and one percent error
rates, which are shown in Figure 2.5-2 and Figure 2.5-3, respectively.

The task module of the ATC program can be built as follows:

1) CATC70.CMD is an indirect command file that compiles all source needed
   for taskbuilding the overlay ATC program. It also purges
   all old object files and spools the overlay descriptor
   language program.

   *Note that ATC70 is compiled with the /DE option,
   which allows all diagnostics in the ATC main program to be
   printed to LUN 4. This requires a larger compiler partition
   and also requires assignment of LUN 4 to a system
   device upon installation of the main program. Therefore,
   if diagnostics are undesired, do not compile with the /DE
   option. However, there is a rather elegant set of diagnostics,
   not to be passed over in haste.

   This program is invoked by typing (in MCR)

   @CATC70

2) ATCOLA.CMD task builds the overlay descriptor program, creating the
   task ATC70. If a map is desired, then add the LP option
   in the BUILD.CMD program.

   This program is invoked by typing (in MCR)

   TKB @ATCOLA

The program executes by typing

RUN ATC70

as shown in Figure 2.5-2.
FIGURE 2.5-2 EXAMPLE OPERATION OF THE ATC PROGRAM

RUN ATC70

SHALL WE DO SERIALIZATION OF DATA(Y/N)?Y
SHALL WE INSERT ERRORS IN CHANNEL(Y/N)?Y

ERROR RATE IN F11.4=

[illegible]

SHALL WE ENCODE AND DECODE(Y/N)?Y
USE PITCH WEIGHTING(Y/N)? Y
USE FAST SORT(Y/N)? Y

SAMPLING RATE AND XMIT DATA RATE IN 2I6=6400,9600
IS THE INPUT ON MAG. TAPE? N
IS THE OUTPUT GOING TO MAG TAPE? N
OUTPUT FILE NAME= NL.5
INPUT FILE NAME= SPEECH.DAT
NO FRAMES= 10

TOTAL NUMBER OF BITS= 369
[illegible]

FIGURE 2.5-3 EXAMPLE OPERATION OF THE ATC PROGRAM
2.6 Summary and Conclusions

The ATC optimization studies resulted in an ATC system which does not
degrade significantly with a BER of 10^-2 at a data rate of 9600 b/s. The
specifications for the optimized system are shown in Table 2.6-1. The
actual quantization tables can be found in either the FORTRAN listing of
Appendix C or in tables within section 3 of Volume 2 describing the
real-time MAP software.

The  voice  quality  produced  by  the  9600  b/s  ATC  simulations  is  the  best 
of  any  technique  now  known  to  GTE.  The  technique,  however,  is  numerically 
complex  requiring  the  complete  processing  capability  of  the  CSP,  Inc. 
MAP-300  floating  point  processor.  Thus,  for  ATC  to  be  practical,  either 
higher  speed  hardware  must  be  built  or  the  technique  must  be  simplified. 

Future speech digitization development at 9600 b/s cannot ignore the ATC
algorithm because, even though the technique is complex, it shows that
good quality speech is possible at this data rate. Thus, the ATC technique
developed under this contract will serve as a benchmark or standard against
which to compare all new 9600 b/s speech digitization algorithms.
PARAMETER                              SPECIFICATION

Input Bandwidth                        0-3200 Hz
Sampling Rate                          6400 Hz
Frame Rate                             26.016/sec.
Number of Samples/Frame                246
Number of Samples Overlapped/Frame     10
Bits/Frame                             369
Pitch                                  6 bits if voiced, 0 if unvoiced
Pitch Gain                             2 bits if voiced, 0 if unvoiced
Voiced/Unvoiced                        1 bit
RMS Energy                             5 bits
DC Bias                                5 bits
PARCOR 1                               5 bits
PARCOR 2                               5 bits
PARCOR 3                               4 bits
PARCOR 4                               4 bits
PARCOR 5                               3 bits
PARCOR 6                               3 bits
PARCOR 7                               2 bits
PARCOR 8                               2 bits
Parity Bits (Error Correction)         54 bits
SYNC                                   1 bit
DCT Coefficients                       267 bits if voiced, 275 if unvoiced
Number of Error Control Blocks/Frame   3
Error Control Technique                (63,45) BCH

TABLE 2.6-1: OPTIMIZED ATC SYSTEM SPECIFICATION
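The entries of Table 2.6-1 can be cross-checked against each other. The sketch below sums the per-frame bit allocations for voiced and unvoiced frames and compares the frame rate implied by the bit budget with the one implied by the sample counts; it assumes that 246 samples is the frame advance, and the dictionary keys are illustrative names.

```python
# Bits per frame, transcribed from Table 2.6-1.
COMMON_BITS = {
    "voiced_unvoiced": 1,
    "rms_energy": 5,
    "dc_bias": 5,
    "parcor_1_to_8": 5 + 5 + 4 + 4 + 3 + 3 + 2 + 2,
    "parity": 54,
    "sync": 1,
}
VOICED_BITS = {"pitch": 6, "pitch_gain": 2, "dct_coefficients": 267}
UNVOICED_BITS = {"pitch": 0, "pitch_gain": 0, "dct_coefficients": 275}


def frame_bits(mode_bits):
    """Total bits in one frame for the given voiced/unvoiced allocation."""
    return sum(COMMON_BITS.values()) + sum(mode_bits.values())


frame_rate_from_bits = 9600 / frame_bits(VOICED_BITS)   # frames per second
frame_rate_from_samples = 6400 / 246                     # frames per second
print(frame_bits(VOICED_BITS), frame_bits(UNVOICED_BITS))
print(round(frame_rate_from_bits, 3), round(frame_rate_from_samples, 3))
```

Both frame types come to the same total because the 8 bits saved on pitch and pitch gain in unvoiced frames are given to the DCT coefficients.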
Appendix A Primitive BCH Codes

The BCH codes described in this appendix are cyclic codes that are
well defined in terms of the roots of the generator polynomials [1].
These codes were discovered by Bose and Chaudhuri [2] and separately
by Hocquenghem [4]. A binary (n,k) BCH code word consists of n
symbols (bits in the binary case) where the first k bits are the information
bits and the remaining r = n-k bits are redundant parity checks.
It is convenient to represent code words with polynomials as

f(x) = f_0 + f_1 x + ... + f_{n-1} x^{n-1},   f_i = 0 or 1      (A1)

where each bit position is associated with a locator. If f(x) is a code
word, then

f_1(x) = f_1 + f_2 x + ... + f_{n-1} x^{n-2} + f_0 x^{n-1}      (A2)

is also a code word in a cyclic code. In the primitive BCH code, which
is the most convenient and powerful BCH code in theory and practice, the
block length of the code may be defined as

n = 2^m - 1      (A3)

and with at most mt parity checks, it can correct any set of t independent
errors within the block of n bits, where m and t are arbitrary positive
integers [5]. This code may be described conveniently with the aid of
finite Galois field theory introduced in Appendix B.
Let \alpha be a primitive element of the finite field GF(2^m); then the
primitive BCH code may be described as the set of polynomials f(x) such that

f(\alpha^i) = 0,   i = 1, 3, 5, ..., 2t-1      (A4)

It is known in coding theory that these polynomials consist of all multiples
of a single polynomial g(x), known as the generator polynomial.
This polynomial also satisfies the equations

g(\alpha^i) = 0,   i = 1, 3, 5, ..., 2t-1      (A5)

These generator polynomials are tabulated in Table A for the selected
primitive BCH codes.
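As an illustration of how a generator polynomial is assembled from its factors, the sketch below multiplies the three degree-6 factors of the (63,45) code over GF(2). The factors for \alpha and \alpha^3 are the ones listed in Table A; for \alpha^5 the sketch uses x^6 + x^5 + x^2 + x + 1, the minimal polynomial given in standard tables for the field generated by x^6 + x + 1. Polynomials are stored as integers whose bit i is the coefficient of x^i, and the helper names are illustrative.

```python
def poly_from_exponents(exponents):
    """(6, 1, 0) -> integer whose set bits are the listed powers of x."""
    return sum(1 << e for e in exponents)


def gf2_poly_mul(a: int, b: int) -> int:
    """Multiply two binary polynomials (carry-less multiplication)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


M1 = poly_from_exponents((6, 1, 0))          # minimal polynomial of alpha
M3 = poly_from_exponents((6, 4, 2, 1, 0))    # minimal polynomial of alpha^3
M5 = poly_from_exponents((6, 5, 2, 1, 0))    # minimal polynomial of alpha^5

g_63_45 = gf2_poly_mul(gf2_poly_mul(M1, M3), M5)
print(oct(g_63_45), "degree", g_63_45.bit_length() - 1)
```

The degree of the product, 18, equals n - k = mt for m = 6 and t = 3, in agreement with eq. (A3) and the parity-check count.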


Encoding  Procedures 

Let the k information bits be represented by the polynomial d(x) as

d(x) = \sum_{i=0}^{k-1} d_i x^i      (A6)

then the code word of n bits may be expressed as

f(x) = x^{n-k} d(x) + r(x)      (A7)

where r(x) is the remainder (parity check) obtained according to the
following equation:

x^{n-k} d(x) / g(x) = q(x) + r(x) / g(x)      (A8)
Block length n   k     t   Generator Polynomial

63               57    1   g_1(x) = (6, 1, 0) = x^6 + x + 1
                 51    2   g_3(x) = g_1(x) · (6, 4, 2, 1, 0)
                 45    3   g_5(x) = g_3(x) · (6, 5, 2, 1, 0)

127              120   1   g_1(x) = (7, 3, 0) = x^7 + x^3 + 1
                 113   2   g_3(x) = g_1(x) · (7, 3, 2, 1, 0)
                 106   3   g_5(x) = g_3(x) · (7, 4, 3, 2, 0)

255              247   1   g_1(x) = (8, 4, 3, 2, 0) = x^8 + x^4 + x^3 + x^2 + 1
                 239   2   g_3(x) = g_1(x) · (8, 6, 5, 4, 2, 1, 0)
                 231   3   g_5(x) = g_3(x) · (8, 7, 6, 5, 4, 1, 0)

TABLE A: GENERATOR POLYNOMIALS FOR SELECTED PRIMITIVE BCH CODES
where g(x) is the generator polynomial of the code. Therefore, encoding
can be performed by the following procedure:

1). Calculate x^{n-k} d(x) by left shifting the information bits
    n-k times

2). Calculate the remainder (parity bits) r(x) from the division
    of x^{n-k} d(x) by g(x)

3). Add the polynomials x^{n-k} d(x) and r(x) to form the code
    word

Steps 1) and 3) can be done simply by shifting and addition.
However, step 2) is rather involved in computation if the
actual division is performed to get the remainder. If the BCH code is
specified and it is desired to speed up the processing of 2), it is
recommended to use a look-up table procedure for the calculation of the
remainder. The code word is then transmitted through the noisy
channel, where the received code word may be altered depending on the
introduction of channel errors.
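The three encoding steps can be sketched with integer bit operations. The example below performs the division of step 2) directly rather than by table look-up. It assumes the (63,45) generator polynomial in its commonly tabulated octal form 1701317 and a bit ordering in which bit i of an integer is the coefficient of x^i; neither detail is taken from the FORTRAN listing, and the function names are illustrative.

```python
N, K = 63, 45
G_63_45 = 0o1701317        # generator polynomial, degree n - k = 18


def gf2_poly_mod(dividend: int, divisor: int) -> int:
    """Remainder of binary polynomial division (repeated shift and XOR)."""
    divisor_len = divisor.bit_length()
    while dividend.bit_length() >= divisor_len:
        dividend ^= divisor << (dividend.bit_length() - divisor_len)
    return dividend


def bch_encode(data: int) -> int:
    """Systematic encoding: f(x) = x^(n-k) d(x) + r(x)."""
    if not 0 <= data < (1 << K):
        raise ValueError("data must fit in k = 45 bits")
    shifted = data << (N - K)                 # step 1: x^(n-k) d(x)
    parity = gf2_poly_mod(shifted, G_63_45)   # step 2: remainder r(x)
    return shifted ^ parity                   # step 3: add the two polynomials


codeword = bch_encode(0x1234567)
print(f"{codeword:063b}")
```

Every code word produced this way leaves a zero remainder when divided by g(x), and the information bits appear unchanged in the upper 45 bit positions.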

Decoding  Procedures 

There are several algorithms for the decoding of BCH codes. Efficient
decoding algorithms have been discovered for BCH codes [3] - [7]. The
Berlekamp decoder is particularly attractive for powerful codes that
provide for a good deal of error correction (e.g., 10 or more errors). The
Peterson algorithm, however, is more efficient for less powerful codes
(e.g., the codes used in generalized burst trapping). In this decoding
procedure, the problem of finding efficient solutions to the key decoding
equation will be addressed by using the Peterson technique.

When a BCH code word f(x) is transmitted over a noisy channel, this
code word may be corrupted by the channel, and what is received, y(x), can
be different from the intended code word. Thus, the received word may be
expressed as

y(x) = f(x) + e(x)      (A9)

where e(x) is the error polynomial which a decoder must compute to correct
errors introduced by the channel. Let the received data be expressed as the
vector Y,

Y = [y_0, y_1, ..., y_{n-1}]      (A10)

or its associated polynomial y(x) by

y(x) = y_0 + y_1 x + ... + y_{n-1} x^{n-1}      (A11)

Denote each of the error location numbers by \beta_j, j = 1, 2, ..., t; then
it is shown [1] that the power sums S_i can be expressed as

S_i = y(\alpha^i) = \sum_{j=1}^{t} \beta_j^i,   i = 1, 3, 5, ..., 2t-1      (A12)

In order to find the error locations, the Peterson procedure consists of
three steps:
Step 1: Compute the power sums from the received sequence through
the relations

S_i = y(\alpha^i),   i = 1, 3, 5, ..., 2t-1      (A13)

Step 2: Compute the symmetric functions \sigma_k, k = 1, 2, ..., t from
the power sums S_i, i.e.,

\sigma(x) = x^t + \sigma_1 x^{t-1} + ... + \sigma_{t-1} x + \sigma_t
          = (x + \beta_1)(x + \beta_2) ... (x + \beta_t)      (A14)

and the \sigma_k's may be obtained by the use of Newton's identities [1].

If the matrix M_t is singular, then reduce the error
number t by 2 and proceed with it again.

Step 3: Find the error position locators \beta_j, j = 1, 2, ..., t, which
are the roots of the polynomial \sigma(x) in eq. (A14).

An efficient algorithm for calculating the \beta_j's from eq. (A14) has been
developed by Chien [5], and all that remains to completely specify a
binary BCH decoder is the computation of the coefficients of the error locator
polynomial, the \sigma_j's. As noted from eq. (A15), the calculation of the
\sigma_j's involves a matrix inversion, which can be expressed analytically for
the case t <= 3. The results are:

For t = 1,

\sigma_1 = S_1

For t = 2,

\sigma_1 = S_1
\sigma_2 = (S_3 + S_1^3) / S_1

For t = 3,

\sigma_1 = S_1
\sigma_2 = (S_1^2 S_3 + S_5) / (S_1^3 + S_3)
\sigma_3 = (S_1^3 + S_3) + S_1 \sigma_2

The calculation of the \sigma_j's and the estimation of the error number are
shown in Figure A1 for t = 3. The flowchart of Chien's search decoding
procedure is shown in Figure A2. This flowchart is for t = 3, i.e., the
decoding algorithm can correct up to 3 errors. One interesting observation
in this decoding procedure is that the correction of errors may be
performed erroneously if the number of errors in the block is greater
than 3. Hence, the corrections may introduce additional channel errors.
In order to avoid these additional errors, error corrections are made
only when the estimated error number (NES in Figure A1) equals the
measured error number (K in Figure A2). This procedure eliminates most
of the additional errors when more than 3 errors exist in the received
word. In other words, the detection of more than 3 errors (i.e., 4, 5,
6, ..., etc.) is feasible in most cases. This fact contributes some
improvement to the coder performance when the channel is very noisy (bit
error rate ~10^-2).

FIGURE A1: COMPUTATION OF \sigma_1, \sigma_2, \sigma_3
[Flowchart. With \alpha a root of m_1(x) = x^6 + x + 1 (so that \alpha^63 = 1),
the power sums S_i = y(\alpha^i), i = 1, 3, 5, are computed from the received
code word y(x), followed by APS = S_1 + S_3 + S_5 and DET = S_1^3 + S_3.
Outcomes: NES = 0 (no error exists), NES = 1 (1 error exists),
NES = 2 (2 errors exist), NES = 3 (3 or more errors exist).]

FIGURE A2: CHIEN'S SEARCH DECODING PROCEDURE
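The decoding steps for t = 3 can be put together in a compact sketch for the (63,45) code. It follows the closed-form expressions for the \sigma_j's, searches all 63 positions for roots in the manner of Chien's search, and corrects only when the number of roots found equals the estimated error number. The field tables are built from x^6 + x + 1; the bit ordering (bit j of an integer is the coefficient of x^j) and all names are assumptions, and the no-error test checks each power sum separately instead of using the APS sum of Figure A1.

```python
N, M = 63, 6
EXP = [0] * (2 * N)          # EXP[p] = alpha^p as a 6-bit integer
LOG = [0] * (N + 1)          # LOG[v] = p such that alpha^p = v
value = 1
for power in range(N):
    EXP[power] = EXP[power + N] = value
    LOG[value] = power
    value <<= 1
    if value & (1 << M):
        value ^= 0b1000011   # reduce modulo x^6 + x + 1


def gf_mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]


def gf_div(a, b):            # b must be non-zero
    return 0 if a == 0 else EXP[(LOG[a] - LOG[b]) % N]


def gf_pow(a, e):
    return 0 if a == 0 else EXP[(LOG[a] * e) % N]


def power_sum(received, i):
    """S_i = y(alpha^i)."""
    s = 0
    for j in range(N):
        if (received >> j) & 1:
            s ^= EXP[(i * j) % N]
    return s


def estimate_sigmas(s1, s3, s5):
    """Return (NES, [sigma_1, sigma_2, sigma_3]) from the closed forms."""
    if s1 == 0 and s3 == 0 and s5 == 0:
        return 0, [0, 0, 0]
    det = gf_pow(s1, 3) ^ s3
    if det == 0:
        return 1, [s1, 0, 0]
    sigma2 = gf_div(gf_mul(gf_pow(s1, 2), s3) ^ s5, det)
    sigma3 = det ^ gf_mul(s1, sigma2)
    return (2 if sigma3 == 0 else 3), [s1, sigma2, sigma3]


def decode(received):
    """Return (word, ok); the word is corrected only if root count == NES."""
    nes, sigma = estimate_sigmas(*(power_sum(received, i) for i in (1, 3, 5)))
    roots = []
    for pos in range(N):     # test x = alpha^pos in sigma(x)
        x = EXP[pos]
        acc = gf_pow(x, nes)
        for k in range(1, nes + 1):
            acc ^= gf_mul(sigma[k - 1], gf_pow(x, nes - k))
        if acc == 0:
            roots.append(pos)
    if len(roots) != nes:
        return received, False   # more than 3 errors suspected
    for pos in roots:
        received ^= 1 << pos
    return received, True
```

A quick way to exercise the sketch is to take the generator polynomial itself (octal 1701317, which is a valid code word), flip up to three bits, and confirm that `decode` restores it.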
Appendix B Operations in Galois Field

A Galois field is a finite set of elements that satisfy the axioms
of a general field. Two operations (addition and multiplication) and
their inverses are defined on the field elements. There is an identity
element for each of the operations (0 for addition, 1 for multiplication)
that is itself in the field. Also, both additive inverses and multiplicative
inverses are in the field. Finally, the rules of commutativity and
associativity are obeyed by the elements of the field.

Consider the following sixteen polynomials and their binary vector
representations.

Polynomial              Vector

0                       0000
1                       0001
1 + X                   0011
1 + X + X^2             0111
1 + X + X^2 + X^3       1111
X                       0010
X + X^2                 0110
X + X^2 + X^3           1110
X^2                     0100
X^2 + X^3               1100
X^3                     1000
1 + X^3                 1001
1 + X^2                 0101
1 + X^2 + X^3           1101
X + X^3                 1010
1 + X + X^3             1011
As long as addition and multiplication of these polynomials are defined so
that the axioms for the field are obeyed, then this will, in fact, be a
Galois field of 2^4 elements (GF(2^4)).

Addition is defined to be modulo 2. Each element is its own additive
inverse, and addition and subtraction of elements are the same.

Multiplication must be defined so that the product of two elements
does not take us out of the field. For this reason, multiplication in a
Galois field is not ordinary multiplication of polynomials. Rather,
multiplication is defined modulo an irreducible polynomial, the primitive
polynomial of the Galois field. For our field GF(2^4), the primitive polynomial
is 1 + X + X^4. To generate the 16 vectors in the field, all one needs to
do is to divide X^m, where m = 0, 1, ..., 14, by the primitive polynomial
and take the remainder; the zero vector completes the set.


Power     Remainder modulo 1 + X + X^4

X^0       1
X^1       X
X^2       X^2
X^3       X^3
X^4       X + 1
X^5       X^2 + X
X^6       X^3 + X^2
X^7       X^3 + X + 1
X^8       X^2 + 1
X^9       X^3 + X
X^10      X^2 + X + 1
X^11      X^3 + X^2 + X
X^12      X^3 + X^2 + X + 1
X^13      X^3 + X^2 + 1
X^14      X^3 + 1
X^15      1 (= X^0)
It is now seen that the product of two binary vectors in the field is
found by adding their exponents. The table repeats every fifteen powers, so
the exponent arithmetic is all done modulo 15:

X^i · X^j = X^{(i+j) mod 15}

X^i / X^j = X^{(i-j) mod 15}
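The construction of GF(2^4) described in this appendix can be reproduced in a few lines. The sketch below generates the fifteen powers of X modulo 1 + X + X^4 and then multiplies and divides field elements by adding and subtracting exponents modulo 15. Elements are stored as 4-bit integers whose bit i is the coefficient of X^i; the function and variable names are illustrative.

```python
PRIMITIVE = 0b10011          # 1 + X + X^4

powers = []                  # powers[m] = X^m reduced modulo the primitive polynomial
value = 1
for m in range(15):
    powers.append(value)
    value <<= 1              # multiply by X
    if value & 0b10000:      # degree 4 reached: subtract (XOR) the primitive polynomial
        value ^= PRIMITIVE

exponent_of = {v: m for m, v in enumerate(powers)}


def gf16_mul(a: int, b: int) -> int:
    if a == 0 or b == 0:
        return 0
    return powers[(exponent_of[a] + exponent_of[b]) % 15]


def gf16_div(a: int, b: int) -> int:
    if b == 0:
        raise ZeroDivisionError("division by the zero element")
    if a == 0:
        return 0
    return powers[(exponent_of[a] - exponent_of[b]) % 15]


for m, v in enumerate(powers):
    print(f"X^{m:<2d} -> {v:04b}")
```

The printed vectors can be compared line by line with the table of remainders, and the fifteen values are all distinct, which is what makes X a primitive element of this field.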




APPENDIX  C 


FORTRAN  Source  Listings  for  the  ATC  Simulation 

This appendix contains the FORTRAN source programs for the ATC simulation.
The first page of this listing is a compile file which uses FORTRAN-IV
PLUS to generate object files from the source files and which sends listings
to the line printer. The second page of these programs is the overlay
description language (ODL) needed to build the ATC task under the RSX-11M
operating system. The remainder of the appendix includes the main program
and subroutines. The order of the programs follows the order of the files
as listed in the overlay description language on the second page.